
Topic: How can I consistently distinguish among tables, fields, and records in a database? - selfpublishingguru.com

10.04% popularity

I am describing a database for a scientific publication. The database has many tables, and each table has fields (spreadsheet columns) and records (rows).

I spend a lot of time discussing tables and relationships among tables, as well as the meanings of individual fields and records.

So here is the most descriptive "long-form" version:

The "trees" table is related to "apples" table so that each "tree" (record in the "trees" table) can have zero or more records in the apples table, but each individual apple (a record in the apples table) comes from only one tree. An apple record includes a color field, and apples can be either "red", "green", or NULL.

Here is a more realistic representation of my current draft (which requires revision for clarity):

The trees table is related to apples table and each tree may have zero or more apples, but each apple comes from only one tree. An apple's color can be "red", "green", or NULL.

I could define a convention, such as "table names in bold, fields in italics, and records in code". This would reduce the above to:

Each tree in the trees table can be associated with zero or more apple records in the apples table, but each apple comes from only one tree. Each apple has a color field that can be either "red", "green", or NULL.

I would like to know: what can I do to provide a clear and consistent interpretation of the database?

In addition to the sentence-level suggestions, I would appreciate paragraph- and section-level advice. For example, how should I describe a table in one or two paragraphs? How do I map a relational database onto linear prose? Currently the tables are organized into a logical story, in order of decreasing importance.



More posts by @Karen856

4 Comments


10% popularity

I would not use different fonts or type styles for each of the various "kinds of things". I've read technical books that use such techniques, and even when it's a simple thing like "italics for table names and bold for screen labels", I find myself repeatedly asking, "Wait, was it italics for table names or for screen labels? Which was which?" Once you pass two kinds of things, it just gets confusing. It doesn't help.

Personally, I routinely capitalize the names of database objects to highlight them and distinguish them from ordinary uses of the word. For instance, if I have a table of employees, I'll refer to it as "Employee". You might use italics or bold or whatever. But I use the same convention for all database objects, rather than trying to create different typographical conventions for tables, records, fields, relationships, indexes, etc.

In general, when you first introduce an object, you should clearly state what it is. Like, "We will create a Trees table to hold ..." Don't just say "We will create trees ..." because then the reader might get confused about whether Trees is a table or, say, a set of records. But once you've introduced each object, I don't think it's necessary to constantly repeat what kind of thing it is. For example, write, "We will create a Trees table and an Apples table. Each record in Trees is related to zero or more records in Apples ... There is a record in Trees for the 'big oak in my front yard'. But there are no records in Apples for the big oak in my front yard, because it is not an apple tree ..." Etc.

Think of how you refer to people and things in ordinary narrative. You might introduce Bob and say that he is Sally's brother. You wouldn't find it necessary to always refer to him as "Bob, the person who is the brother of Sally". You surely wouldn't write, "Bob, who is the brother of Sally, walked into the store. Then Bob, the brother of Sally, met Fred. Fred called out, 'Hi, Bob' (addressing Sally's brother) ..." etc. Rather, you would describe him once when you first introduced him, and then generally just use his name after that. If you mention him again after not having talked about him for a while, you might repeat the description to remind the reader who he is. If there's a case where it might be ambiguous, like there are two Bobs, you might add the description to make clear which you are referring to. If something about the description is important in a particular context, you might bring it up again to make clear. Like, Bob has known Sally since she was born because he is her brother. If you're saying something that relies on this fact to make sense, you might clarify, "As he was Sally's brother, Bob knew about ..."



10% popularity

My first advice: Don't make your readers' eyes bleed.

Imagining several paragraphs with bold, italics, grayed text, whatever, just makes my skin crawl. Trying to keep in mind which formatting means what makes me cry. I couldn't stand reading a page of that.

Second advice: Don't use experts' diagrams for laymen.

The diagrams Maura links to are incomprehensible for non-techies. You mentioned in the comments that your audience is not familiar with database terms, so I guess they are non-techies.

So what you can do with formatting is highlight the names of the tables, records, and fields. E.g. write all names in italics (the apple table, the color field) or, because you are writing in English, capitalize them (the Apple table, the Color field).

Then the reader knows it is the name of a special term (table, record, field) without having to keep in mind which formatting means what (and without dying of eye cancer).

And if you always call a table a table, a record a record, and a field a field (as you did in your example), then the reader should be able to follow.

What you can do with diagrams is invent a layman's version of the experts' ones. You already mention spreadsheets, columns, and rows, and I bet your audience will know these. So a diagram like this one can support understanding:

Well, make it a little bit more shiny ;)



10% popularity

My first question, as always in my writing specialty, is who is your audience? It makes a great deal of difference to know the expected sophistication of the audience and their experience with databases.

The "trees" table is related to "apples" table so that each "tree" (record in the "trees" table) can have zero or more records in the apples table, but each individual apple (a record in the apples table) comes from only one tree. An apple record includes a color field, and apples can be either "red", "green", or NULL.

For example, a person with database experience would know what a relational database is, and you would only have to describe "tree" as the primary key of the Tree table. The Apple table would have a foreign key referencing "tree", and the color field is often also called an attribute.

If you have a person with NO database knowledge, you have a different problem and you have to ask yourself if you need to describe the database and tables and such or you merely need to describe the relationships and not overload them with talk of tables, rows, etc.

My typical experience with databases, especially complicated ones, is that trying to describe them in paragraph form is painful, and I tend to move toward a schema description via document tables and/or an entity-relationship diagram. These are better understood than trying to invent a convention of bold, italic, etc. The diagrams also make the relationships far easier to understand and allow the readers a way to trace relationships without having to (re)-decipher a written explanation.

Here's a Wikipedia page on entity-relationship modeling that's a reasonable reference: Entity-relationship model. If you have truly huge databases or very, very complex ones, these are typically broken down into subsets according to functionality or objects.

I have seen a quick format used of something like:

Table: Tree
- TreeName (primary key, string)

Table: Apple
- TreeName (foreign key, Tree:TreeName)
- Color (string)

I'm not convinced this helps a lot but it might be a middle-of-the-road solution.
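For readers who want to check such an outline against a real database, here is a minimal sketch using Python's built-in sqlite3 module. It is only one possible reading of the outline above: the AppleId column, the sample rows, and the restriction of Color to "red" and "green" are assumptions added for illustration (the color values come from the question, not from the outline).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Tree (
    TreeName TEXT PRIMARY KEY
);
CREATE TABLE Apple (
    AppleId  INTEGER PRIMARY KEY,  -- hypothetical surrogate key, not in the outline
    TreeName TEXT NOT NULL REFERENCES Tree (TreeName),
    Color    TEXT CHECK (Color IN ('red', 'green'))  -- NULL still passes the check
);
""")

# Hypothetical sample data: one tree with no apples, one tree with two.
conn.execute("INSERT INTO Tree VALUES ('big oak')")
conn.execute("INSERT INTO Tree VALUES ('old apple tree')")
conn.executemany(
    "INSERT INTO Apple (TreeName, Color) VALUES (?, ?)",
    [("old apple tree", "red"), ("old apple tree", None)],
)

# Each tree has zero or more apples: the LEFT JOIN keeps trees without apples.
apple_counts = dict(conn.execute("""
    SELECT Tree.TreeName, COUNT(Apple.AppleId)
    FROM Tree LEFT JOIN Apple ON Apple.TreeName = Tree.TreeName
    GROUP BY Tree.TreeName
"""))

# Each apple belongs to exactly one existing tree: an orphan apple is rejected.
try:
    conn.execute("INSERT INTO Apple (TreeName, Color) VALUES ('no such tree', 'red')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

A runnable schema like this will not replace the prose for a non-technical audience, but it can serve as a check that the written description and the actual constraints say the same thing.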



10% popularity

Does the publication in question have relevant style guidelines? (I'm assuming not or you wouldn't be asking here.)

In your proposed solution, you are using both formatting and (initial) explicit labeling to convey information: "the trees table" rather than just "trees", for instance. This is good; it reinforces your formatting convention while facilitating scanning of the text. (It would be easy to miss that common words like "trees" are actually proper names in your database.) You may encounter sentences where this verbosity gets in your way and you're tempted to use just the formatting; that can work well for later stages of a description, like in your "but..." clause, but I recommend leading off any new discussion with the more-verbose form.

In addition, in technical writing (as distinct from other prose), it is helpful to be ruthlessly consistent in your use of technical terms, lest a reader see the absence of something as significant. In your example, you say a tree "can be associated with" zero or more apples; when talking about the reverse I recommend using the "associated" language rather than the more colloquial "comes from", especially if there actually isn't any directionality to the associations. (That is, since you're talking about databases rather than graphs or pointers, I assume you mean bi-directional joins.) For an audience fluent in databases I wouldn't make this recommendation, but your audience is more general (judging from comments on the question) so give them the extra help.

This is tangential to your question but I couldn't help noticing: is the color field red/green/null, or is the value of the color field red/green/null? That is, does the field describe the column (as you said up front), or does it describe the "cell" (the value of that attribute for a particular record)? I bring this up because it's another example of the kind of precision that's really important, particularly for an audience that isn't already familiar with the domain and its terminology.
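The distinction between a field and a field's value can be made concrete with a small sketch. This is only an illustration in plain Python, not a description of the asker's database: the Apple class, its field names, and the sample records are all hypothetical, and None stands in for NULL.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Apple:
    """One record; the attribute definitions play the role of fields (columns)."""
    tree_name: str
    color: Optional[str]  # "red", "green", or None (standing in for NULL)


# The fields belong to the table definition: there is exactly one "color" field.
field_names = [f.name for f in fields(Apple)]

# The values belong to individual records: each apple has its own color value.
apples = [Apple("old apple tree", "red"), Apple("old apple tree", None)]
color_values = [a.color for a in apples]
```

Here "color" appears once as a field name, while "red" and None appear as values of that field in two different records, which is the wording distinction the revised sentence should preserve.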

Putting all this together, I would revise your paragraph thus:

Each tree record in the trees table is associated with zero or more apple records in the apples table, but each apple is associated with only one tree. Each apple has a color field with a value of "red", "green", or NULL.

