Monday, March 9, 2020

How to Use the Harvard Khipu Database files

When I first started studying khipu, I used the khipu data tables that are set up on Excel files. When I started doing a more in-depth study and wanted to search for khipu meeting certain criteria, I found that I needed to use the SQL database provided on the Harvard Khipukamayuq website. Without a background in computer science, and no previous experience with SQL, I quickly gave up. But opening each Excel file took hours. So I learned SQL through the free Coursera course (when you select the course, it will give a few payment options for certificates, but I just click the "Audit this course" option at the bottom, which provides free access to all videos, but no certificate to show employers). The database files are in a format to open with MySQL, which seems to require setup of a server. So I converted them to SQLite format, which can be easily run on any computer, and installed DB Browser for SQLite to input and search the values. I have included setup instructions below for anyone who wishes to do the same. My hope is to spare any would-be khipu fanatics from the months of pain I went through.

Another problem I came up against was making sense of how the database was organized. The Excel files each have three tabs: Heading, Pendant Detail, and Notes. Whereas the database has almost 50 tables. It was overwhelming. To help me write the queries, I prepared a file that acts as a key between the Excel files and the Khipu Database files, explaining where to find everything. You can access the Google Sheets file here. I have provided brief explanations of the tables below, the kind of explanation I wish was included in a Readme file with the KhipuDB_SQL zip file. The database can be downloaded from the Harvard Khipukamayuq website, direct link here.

photo taken by Dr. Gary Urton, found on khipukamayuq.fas.harvard.edu Khipu Gallery
https://khipukamayuq.fas.harvard.edu/images/Miscellaneous/Canuto.jpg
"canuto" refers to the woven or wrapped portion above the khipu pendant cords


Tables in KhipuDB_SQL

Note: anything with "dc" in the title contains explanations of the abbreviations used in the database, like Ascher cord color OB is "moderate olive brown". I don't know the purpose of the tables labeled "flat" or "processed". If you have any additional insight to the tables, please leave a comment and I'll update the blog entry.
  • app_config_data  Creates a database named "collca", (the Quechua word for storehouse) and configures the database.
  • archive_dc   Notes for where each khipu archive was found.
  • ascher_canutito_color   Canutito is a small canuto (see image above), and refers to the khipu that have a woven textile attached to the primary cord, from which the pendant cords originate. This table contains color data for each canutito.
  • ascher_canuto_color   Contains color information for each canuto.
  • ascher_color_dc    Descriptions for the color names used in the database.
    • OB is ISCC number 95
    • To convert R_DEC, G_DEC, and B_DEC to standard RGB values, use the formula R = R_DEC * 255
  • ascher_cord_color   Colors of each cord. Gives the range, because maybe a cord changes color partway through. Provides the color name, but also separates each color code and the operators in between:
    • FULL_COLOR = "KB:W"
    • COLOR_CD_1 = "KB"
    • OPERATOR_1 = ":"
    • COLOR_CD_2 = "W"
  • attachment_dc   Explains what R, V, and H attachment types mean
  • beginning_dc   Explains the abbreviations used for the beginning of primary cords
  • canutito   Canutito data: start and end position, length, and notes
  • canuto_cord_flat   Canuto cord data
  • compare_seriation   I don't know the purpose of this table.
  • cord    Data for each pendant, subsidiary and top cord, as well as knots and markers. Has length, ordinal number, thickness, ply, fiber type, recto/verso attachment, and cord name.
  • cord_cluster  Contains the spacing and grouping of cords, example: "6.0 cm group of   5 pendant(s)       (1 - 5)        space of    1.0 cm"
  • cord_color_notes
  • cord_color_processed  Seems to be used by Carrie specifically. I don't know its purpose.
  • cord_flat   I don't know the purpose of this table. 
  • cord_notes
  • cord_processed  I don't know the purpose of this table.
  • cord_test   I don't know the purpose of this table.
  • cord_top_level_flat  I don't know the purpose of this table.
  • cord_type_dc   Contains no information
  • cord_value   Has the cord value and alternate cord value for cords. It does not have the cord value for all cords, only 293 of the 38,000 cords that have knots.
  • cord_value_components   I don't know the purpose of this table
  • cords_to_explore  Contains no data
  • fiber_dc  Explains the abbreviations for cord fiber types
  • grouping_class_dc  Explains the abbreviations for cord classification (K for knot, T for top cord)
  • khipu_blob_notes  Contains khipu notes. "Blob" is a data entry type in SQL that is not limited by characters.
  • khipu_defaults  Not sure of its purpose, seems like an early version of the primary_cord table
  • khipu_main   Khipu metadata. Museum name, provenance, etc.
  • khipu_notes   Contains khipu notes, but is not as complete as khipu_blob_notes
  • knot   Knot information such as direction (S or Z), type (single or long), value, number of turns
  • knot_cluster   Knot information such as start and end position, number of knots
  • knot_type_dc  Explains the abbreviations used for different knot types
  • pcord_colors_processed  I don't know the purpose of this table
  • pcord_notes  Notes for primary cords. Some, but not all, of these notes are also included in the primary_cord table.
  • pigmentation_dc   Explains the abbreviations used for pigmentation
  • primary_cord  Data regarding the thickness, ply, length, beginning and termination of primary cords
  • primary_cord_attach  Some khipu were found attached to others. This table describes those attachments.
  • primary_cord_processed   Appears to provide the sum values of all pendant cords on each khipu
  • regions_dc   Describes the North/South region of the provenances given in the database
  • routines   I don't know the purpose of this file. Does not contain any input data, nor does it create tables. Perhaps used for internal purposes.
  • structure_dc   Explains the abbreviations used for different structure types
  • termiantion_dc   Appears to be added by mistake
  • termination_dc  Explains the abbreviations used for different termination types
  • urton_khipu_type   Marks whether khipu falls into certain categories including seriated, banded, census, anomalous, etc.
  • x_canuto_color_flat   Contains data regarding canuto lengths, ordinal position, and colors used.



How to convert KhipuDB_SQL files to SQLite format


I created a simple program that builds an SQLite database from the KhipuDB_SQL files. I considered just posting my SQLite database here, but I chose this method for two reasons. First of all, I don't want to run into copyright issues. Second, I don't know when the Khipu Database will be updated.

To run SQL queries, I prefer to use a GUI called DB Browser for SQLite to browse the data and run queries.

To get the Khipu Database in an SQLite format:

  1. Visit the Harvard Khipukamayuq website and click on Khipu Data Tables.
  2. Click on Full Database (SQL) to download the Khipu Database in a zipped file
  3. Extract the file (in Windows, you can just right-click the folder and select Extract All)
  4. Download my KhipuDBtoSQLite conversion program
  5. Follow the instructions to create a database in SQLite on your computer


Having trouble with the KhipuDBtoSQLite conversion program?

I am a novice at computer programming, so if the program doesn't work, here is how you can create the database manually:
  1. Open the folder containing the KhipuDB_SQL files. You should see a bunch of files beginning with collca_ and ending with .sql
  2. Create a new SQLite database named collca
  3. Open each file and make the necessary adjustments. You don't need all the files, just the ones listed below.
  4. For each file, find and replace the following terms (you can open all of the files as tabs in Notepad++ and replace them across all the files at the same time). be sure to include a space before each term and the ( at the end, this will prevent changes from being made in the notes columns. 
    • replace the following list with TEXT(
      •  char(,  varchar(,  tinytext(,  text(,  blob(,  mediumtext(,  mediumblob(,  longtext(,  longblob(
    • replace the following list with NUMERIC(
      •  datetime(,  date(,  datetime(,  time(
    • replace the following list with  INTEGER(
      •  tinyint(,  smallint(,  mediumint(,  int(,  bigint(
    • replace the following list with REAL(
      •  float(,  double(,  decimal(
  5. The following strings should be replaced with a blank
    • DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    •  DEFAULT '0000-00-00 00:00:00'
    • AUTO_INCREMENT
  6. Replace \n with 2 spaces
  7. \\" with \"
  8. \' with ' ' (that's 2 single quotes)
  9. 195177 with ñ
  10. 195179 with ó
  11. 195169 with é
  12. Now you can open your database. Copy and paste the CREATE TABLE and INSERT INTO codes as queries and run. This will create and fill the tables.

Tables to add to your database:

"archive_dc",
"ascher_canutito_color",
"ascher_canuto_color",
"ascher_color_dc",
"ascher_cord_color",
"attachment_dc",
"beginning_dc",
"canutito",
"canuto_cord_flat",
"cord",
"cord_cluster",
"cord_color_notes",
"cord_notes",
"cord_value",
"fiber_dc",
"grouping_class_dc",
"khipu_blob_notes",
"khipu_main",
"khipu_notes",
"knot",
"knot_cluster",
"knot_type_dc",
"pcord_notes",
"pigmentation_dc",
"primary_cord",
"primary_cord_attach",
"primary_cord_processed",
"regions_dc",
"structure_dc",
"termination_dc",
"urton_khipu_type",
"x_canuto_color_flat"

No comments:

Post a Comment