Re: How to convert CSV row to Java object?

From: Tom Anderson <twic@urchin.earth.li>
Newsgroups: comp.lang.java.programmer
Date: Tue, 31 Aug 2010 13:32:08 +0100
Message-ID: <alpine.DEB.1.10.1008311314370.13100@urchin.earth.li>

On Mon, 30 Aug 2010, markspace wrote:

On 8/29/2010 2:06 PM, Tom Anderson wrote:

On Sun, 29 Aug 2010, markspace wrote:

Most data in a database is relational and therefore can be thought of
as forming tree structures.

I tend to think of relational and hierarchical structures as being
rather different. Are you saying this from the point of view that a
table is trivially a hierarchy, the whole being split into rows which
are then split into fields, or something else?

Something else. Let's see if my idea holds up:

Suppose we have data like those you mentioned, customers and products. You
have a list of customers:

Cust1
Cust2
Cust3

Each customer has presumably some purchases:

Cust1 --+-- Invoice1A
         +-- Invoice1B
         +-- Invoice1C

Each invoice lists various products that were sold:

Cust1 --+-- Invoice1A
                +----- Product_a
         +-- Invoice1B
                +----- Product_b
                +----- Product_c
         +-- Invoice1C
                +----- Product_a
                +----- Product_x

That to me is a tree.

It's a tree to me too. You could draw something very similar for
category/product/SKU as well; it might turn into a DAG if you allow
products to be in multiple categories, but it's broadly treelike.

If SQL forces a tabular format, that's just an artifact of SQL; the
fact that SQL would retrieve this data as a table is immaterial. It
doesn't change the fact that the data itself is a tree.

No, the tree is a way of looking at the data, just as a table is. One
might feel one or the other was more natural, or find one or the other
more practical, but they're ultimately just views. I'm not saying you're
wrong about XML being a good fit for this kind of data - if it is
aesthetically or practically beneficial to treat it as a tree, then XML is
a better choice than CSV - only that you're wrong about the tree being an
essential feature of the data.

Rather, i'd say the structure follows from the access pattern, not from
the data itself. Your tree structure is useful for browsing order history
or computing an account balance. But what if i wanted to ask questions
about inventory levels or who's buying my products, or to order some
products to be picked from a warehouse? Then i would probably want a tree
like:

Product_a
+ Cust1
   + Invoice1A
   + Invoice1C
Product_b
+ Cust1
   + Invoice1B
+ Cust2
   + Invoice2A
Product_c
+ Cust1
   + Invoice1B
Product_x
+ Cust1
   + Invoice1C
+ Cust3
   + Invoice3A
   + Invoice3B

So that for each product, i can quickly find out how much i've sold to
whom, and when, or decide how much needs to be picked, and into what
crates.

The nice thing about a tree model is that it imposes an interpretation on
data. The nice thing about a tabular model is that it doesn't.
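
To make the "views" point concrete, here is a minimal Java sketch that
derives both trees from one flat list of rows. The class name InvoiceViews,
the nest() helper and the column order (customer, invoice, product) are
assumptions made for illustration; the row values are the ones in the
diagrams above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class InvoiceViews {
    // One flat row per (customer, invoice, product), as a table would hold it.
    static final List<String[]> ROWS = Arrays.asList(
        new String[] {"Cust1", "Invoice1A", "Product_a"},
        new String[] {"Cust1", "Invoice1B", "Product_b"},
        new String[] {"Cust1", "Invoice1B", "Product_c"},
        new String[] {"Cust1", "Invoice1C", "Product_a"},
        new String[] {"Cust1", "Invoice1C", "Product_x"},
        new String[] {"Cust2", "Invoice2A", "Product_b"},
        new String[] {"Cust3", "Invoice3A", "Product_x"},
        new String[] {"Cust3", "Invoice3B", "Product_x"});

    // Nest the rows two levels deep: outer column -> middle column -> leaf column.
    // The column indexes choose the interpretation; the rows never change.
    static Map<String, Map<String, Set<String>>> nest(
            List<String[]> rows, int outer, int middle, int leaf) {
        Map<String, Map<String, Set<String>>> tree = new TreeMap<>();
        for (String[] row : rows) {
            tree.computeIfAbsent(row[outer], k -> new TreeMap<>())
                .computeIfAbsent(row[middle], k -> new TreeSet<>())
                .add(row[leaf]);
        }
        return tree;
    }

    public static void main(String[] args) {
        // Customer -> invoice -> product: the order-history view.
        System.out.println(nest(ROWS, 0, 1, 2));
        // Product -> customer -> invoice: the sales and picking view.
        System.out.println(nest(ROWS, 2, 0, 1));
    }
}
```

Neither tree is stored anywhere; each is computed on demand from the same
rows, which is the sense in which the table does not commit to an
interpretation.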

Note in my post I did say "relational," trying to imply that there were joins
involved in this scenario. Maybe on that bit I was unclear.

The 'relation' in 'relational' doesn't refer to joins - a 'relation' is a
term from predicate logic which means a predicate function taking several
parameters. For example, if we're talking about invoices, the relation
might be:

is_an_invoice(invoice_number, customer_number, invoice_date)

Which might be true of (Invoice1A, Cust1, Tuesday), but false of
(Invoice1A, Cust2, Tuesday). In practice, a relation can be represented as
a set of same-shaped tuples (one for each combination of values that
satisfies the predicate), which looks enough like a table that it's what
Dr Codd built his model upon.
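
As a rough illustration of a relation as a set of tuples, the sketch below
stores the tuples in a Java Set and treats the predicate as a membership
test. The class name InvoiceRelation and the use of List<String> as a
stand-in for a tuple are assumptions; the single tuple is the one from the
example above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InvoiceRelation {
    // The relation as a set of same-shaped tuples:
    // (invoice_number, customer_number, invoice_date).
    // List.equals and List.hashCode compare element by element,
    // so a List<String> can stand in for a tuple here.
    static final Set<List<String>> IS_AN_INVOICE = new HashSet<>();
    static {
        IS_AN_INVOICE.add(Arrays.asList("Invoice1A", "Cust1", "Tuesday"));
    }

    // The predicate holds exactly for the tuples that are in the set.
    static boolean isAnInvoice(String invoiceNumber, String customerNumber,
                               String invoiceDate) {
        return IS_AN_INVOICE.contains(
            Arrays.asList(invoiceNumber, customerNumber, invoiceDate));
    }

    public static void main(String[] args) {
        System.out.println(isAnInvoice("Invoice1A", "Cust1", "Tuesday")); // true
        System.out.println(isAnInvoice("Invoice1A", "Cust2", "Tuesday")); // false
    }
}
```

Printed one tuple per line, the set is what a table shows; no join is
involved in calling it a relation.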

tom

--
It's odd to discover your quips in other people's .sig files. --
Benjamin Rosenbaum