Databases: a new kind of creativity for me
This week in HIST 8500, we learned about databases and tried our hand at designing them using Airtable. I’ll be honest, this is an intellectual challenge I’m going to need some time and practice to wrap my head around. I have a lot of experience using databases, and had a job for a couple of summers entering data in what I think was Microsoft Access. But I’d never designed a database before, and just those questions of “What am I going to put in this database? What tables am I going to connect? What’s the basic function of this database and what do I want to be able to find out with it?” felt really unfamiliar and difficult to answer. I do think Airtable was pretty easy to use, though, and was a good way to experiment with some of the relevant choices.
One of my first questions was, what am I going to track as an attribute and what am I going to track as an entity and give its own table? One of the things that I was interested in tracing was the connection between education levels and various professions. I initially figured that I would have two tables: one for individuals and one for occupations.
When I was actually entering the data, though, I noticed two things. The first was that a high proportion of the (admittedly super small!) number of adults I was entering had education that the census described as “Elementary School, 4th grade.” The second was that as I continually typed “Elementary School, 4th grade” out, I was making typos and had to go back and fix them.
I remembered from both our readings and our discussions that one of the things a database is supposed to do is eliminate redundancies and places where human error creates a division between, say, “Elementary School, 4th grade” and “Elmentary School, 4th grade.” So I added a third table for education and gave every “Highest grade of school completed” value its own unique identifier.
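To make that idea concrete outside of Airtable, here is a minimal sketch in Python using the built-in sqlite3 module. It is not how Airtable works under the hood (Airtable handles this through linked record fields); it just illustrates the same principle. The table and column names, the "Person A" style placeholder names, and the helper functions are all assumptions made up for the example.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory database with a separate education table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces links between tables when this is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE education (
            id    INTEGER PRIMARY KEY,
            level TEXT NOT NULL UNIQUE   -- each level is typed exactly once
        );
        CREATE TABLE individuals (
            id           INTEGER PRIMARY KEY,
            name         TEXT NOT NULL,
            education_id INTEGER NOT NULL REFERENCES education(id)
        );
    """)
    conn.execute(
        "INSERT INTO education (id, level) VALUES (?, ?)",
        (1, "Elementary School, 4th grade"),
    )
    # Placeholder people: both point at the same education row by its id,
    # so the phrase itself is never retyped.
    conn.executemany(
        "INSERT INTO individuals (name, education_id) VALUES (?, ?)",
        [("Person A", 1), ("Person B", 1)],
    )
    return conn


def try_bad_link(conn):
    """Try to link a person to an education id that does not exist."""
    try:
        conn.execute(
            "INSERT INTO individuals (name, education_id) VALUES (?, ?)",
            ("Person C", 99),
        )
    except sqlite3.IntegrityError:
        return False  # the database refused the dangling link
    return True
```

Because each person stores only an identifier, there is nowhere for a misspelled “Elmentary” to sneak in, and an identifier that points at nothing is rejected outright.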
This, of course, required that I figure out how to link data from one table into another. I’m not going to lie, this resulted in a little bit of trial and error, and I totally deleted those columns and re-entered them a few times.
I did (I think!) ultimately figure it out, and here’s what I ended up with:
Even looking at it, I can see there are some duplicate fields I don’t quite understand. For example, when I linked the tables, it seemed like Education and Jobs both got two fields, while I can see on the schema that there’s an “Individuals” and an “Individuals 2,” and it’s “Individuals 2” that got linked. Why? I’m not sure–when I Googled it, most of what I found was instructions on how to duplicate fields in Airtable. But I did do what I set out to do, which was pretty exciting.
In terms of the questions posed for the assignment, which were “What patterns do you notice of the database so far? What questions could you ask of this database? How could it be expanded?”, one question I had was, why fourth grade? Were there laws at the time requiring everyone to attend until at least fourth grade? That’s not a question the database could answer, but it does arise from the pattern I noticed in the data. Some questions that could perhaps be answered with this database include: what level of education was required of university faculty in 1940? What other jobs on campus and in the local Clemson community were most common, and what levels of education did these different jobs require? Were there any differences in the average levels of education for men and women? What were the most common jobs by gender in the area in 1940? I could definitely expand this just by adding more people to the database, but another thing I could do is add a field to the “job” table clarifying whether a person worked for Clemson College or not.
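As a rough illustration of how a question like “what levels of education did these different jobs require?” could be put to a three-table design, here is a small sketch in Python with the built-in sqlite3 module. This is not Airtable and not my actual data: the table names, column names, job titles, and the three sample people are invented placeholders, and the two functions are hypothetical helpers written just for this example.

```python
import sqlite3


def make_sample():
    """Build an in-memory database with invented placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE jobs      (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE education (id INTEGER PRIMARY KEY, level TEXT NOT NULL);
        CREATE TABLE individuals (
            id           INTEGER PRIMARY KEY,
            name         TEXT NOT NULL,
            job_id       INTEGER REFERENCES jobs(id),
            education_id INTEGER REFERENCES education(id)
        );
        -- Everything below is made up for illustration only.
        INSERT INTO jobs VALUES (1, 'Farm laborer'), (2, 'Professor');
        INSERT INTO education VALUES
            (1, 'Elementary School, 4th grade'),
            (2, 'College, 4th year');
        INSERT INTO individuals (name, job_id, education_id) VALUES
            ('Person A', 1, 1),
            ('Person B', 1, 1),
            ('Person C', 2, 2);
    """)
    return conn


def education_by_job(conn):
    """Count how many people hold each job at each education level."""
    return conn.execute("""
        SELECT j.title, e.level, COUNT(*) AS people
        FROM individuals AS i
        JOIN jobs      AS j ON j.id = i.job_id
        JOIN education AS e ON e.id = i.education_id
        GROUP BY j.title, e.level
        ORDER BY j.title, e.level
    """).fetchall()
```

Calling education_by_job(make_sample()) returns one row per job and education pairing along with a head count, which is roughly the kind of summary I would want once more people are entered.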
I wasn’t able to embed my database, unfortunately–I’m increasingly finding that the “embed” block in WordPress gives a lot less flexibility than I, personally, would like. I wish I could just use plain HTML to put in embed codes rather than relying on WordPress’s pre-existing block embed system for a limited set of things like Twitter or YouTube (which, even when it works, makes adjusting the size an incredible pain in the butt). But that’s a technical problem for another day. Today I’m declaring at least a small victory in the field of database creation, and I’m linking my database here.