Csc 1302, Honors Principles of Computer Science II (Fall 2023)
Week 2 (10 September 2023)
Initialize Database from Files; Check Equality of Tuples; Remove Duplicates in Relation
During this week you will write the following methods:
- Initialize Database object by reading data stored in several files in a directory that is given on the command line (this method belongs to Database.py)
    # Create the database object by reading data from several files in directory dir
    def initializeDatabase(self, dir):
        pass
Here is a sample of files available in the directory: drinks (also in .zip format: drinks.zip). The file catalog.dat contains schema information for all the relations in the database, and the individual .dat files contain the relation instances (tuples). To read data from a file, you can use the following code:
    f = open("f.dat", "r")
    s = f.readline().strip("\n")
    ...
    ...
    f.close()
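As an alternative to calling close() yourself, a with block closes the file even if an error occurs while reading. The following is only a minimal sketch: the helper name read_lines and the sample file contents are made up for illustration and are not part of the assignment files.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of a text file with trailing newlines removed."""
    # The with statement closes the file automatically when the block ends.
    with open(path, "r") as f:
        return [line.rstrip("\n") for line in f]


# Usage with a throwaway file (made-up contents, not the real drinks data).
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "f.dat")
with open(sample_path, "w") as f:
    f.write("2\nfirst\nsecond\n")

lines = read_lines(sample_path)
print(lines)  # ['2', 'first', 'second']
```

Reading all lines at once is convenient for small files; calling readline() repeatedly, as in the snippet above, works just as well when you want to consume the file one value at a time.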
- Test equality of tuples (this method belongs to Tuple.py)
    # Return True if this tuple is equal to compareTuple; False otherwise.
    # Make sure the schemas are the same; return False if the schemas are not the same.
    def equals(self, compareTuple):
        pass
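One possible shape for the equality test is sketched below. The class SketchTuple and its fields (attributes, domains, values) are hypothetical stand-ins, since the actual Tuple class is not shown here. The sketch also assumes that "same schema" means identical attribute names and identical domains, which you should check against the course definition.

```python
class SketchTuple:
    """Hypothetical stand-in for the course's Tuple class."""

    def __init__(self, attributes, domains, values):
        self.attributes = attributes  # e.g. ["SID", "SNAME"]
        self.domains = domains        # e.g. ["INTEGER", "VARCHAR"]
        self.values = values          # e.g. [1111, "Robert Adams"]

    def equals(self, compareTuple):
        # ASSUMPTION: schemas match when names and domains are identical.
        if self.attributes != compareTuple.attributes:
            return False
        if self.domains != compareTuple.domains:
            return False
        # Same schema: compare the components position by position.
        return self.values == compareTuple.values


schema = (["SID", "SNAME"], ["INTEGER", "VARCHAR"])
t1 = SketchTuple(*schema, [1112, "Charles Bailey"])
t2 = SketchTuple(*schema, [1112, "Charles Bailey"])
t3 = SketchTuple(*schema, [1113, "Donald James"])
t4 = SketchTuple(["DNAME"], ["VARCHAR"], ["John"])

print(t1.equals(t2))  # True: same schema, same values
print(t1.equals(t3))  # False: same schema, different values
print(t1.equals(t4))  # False: different schema
```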
- Remove duplicate tuples (this method belongs to Relation.py)
    ## Remove duplicate tuples from this relation
    def removeDuplicates(self):
        pass
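Duplicate removal can be built on top of an equals method: keep a tuple only if no equal tuple has been kept already. The sketch below assumes a hypothetical Row class and a free function remove_duplicates that works on a plain list; the real Relation class may store its tuples differently. The sample rows are the STUDENT tuples from the DuplicatesDriver.py output shown in this handout.

```python
class Row:
    """Hypothetical stand-in for a tuple object that offers equals()."""

    def __init__(self, values):
        self.values = values

    def equals(self, other):
        return self.values == other.values


def remove_duplicates(rows):
    """Return a new list that keeps the first occurrence of each row."""
    unique = []
    for row in rows:
        # Keep the row only if no equal row has been kept already.
        if not any(row.equals(kept) for kept in unique):
            unique.append(row)
    return unique


rows = [
    Row([1111, "Robert Adams"]),
    Row([1112, "Charles Bailey"]),
    Row([1113, "Donald James"]),
    Row([1112, "Charles Bailey"]),
    Row([1112, "Charles Bailey"]),
    Row([1114, "Michael James"]),
    Row([1113, "Donald James"]),
]

result = remove_duplicates(rows)
print(len(result))                    # 4
print([r.values[0] for r in result])  # [1111, 1112, 1113, 1114]
```

This approach compares every tuple with every tuple kept so far, so it takes quadratic time in the worst case. That is acceptable for small relations and preserves the original order of first occurrences.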
Download the Driver Programs and implement all the methods in the Python classes. Then run the driver programs.
You should see the following output when you run Driver2.py:
    Mac-mini:week2 raj$ python3 Driver2.py drinks
    BAR(BNAME:VARCHAR)
    Number of tuples:4
    Jillians:
    Dugans:
    ESPN Zone:
    Charlies:
    DRINKER(DNAME:VARCHAR)
    Number of tuples:5
    John:
    Peter:
    Donald:
    Jeremy:
    Clark:
    SELLS(BAR:VARCHAR,BEER:VARCHAR,PRICE:INTEGER)
    Number of tuples:9
    Jillians:Bud:6:
    Jillians:Michelob:6:
    Jillians:Heineken:8:
    Dugans:Bud:9:
    Dugans:Michelob:10:
    Dugans:Fosters:12:
    ESPN Zone:Fosters:9:
    Charlies:Heineken:10:
    Charlies:Foster:10:
    Mac-mini:week2 raj$
and the following output when you run the DuplicatesDriver.py program:
    Mac-mini:week2 raj$ python3 DuplicatesDriver.py
    Before Removing Duplicates:
    STUDENT(SID:INTEGER,SNAME:VARCHAR)
    Number of tuples:7
    1111:Robert Adams:
    1112:Charles Bailey:
    1113:Donald James:
    1112:Charles Bailey:
    1112:Charles Bailey:
    1114:Michael James:
    1113:Donald James:
    After Removing Duplicates:
    STUDENT(SID:INTEGER,SNAME:VARCHAR)
    Number of tuples:4
    1111:Robert Adams:
    1112:Charles Bailey:
    1113:Donald James:
    1114:Michael James:
Pseudo code for initializeDatabase
    ## Create the database object by reading data from several files in directory dir
    def initializeDatabase(self, dir):
        ## Pseudo Code follows
        Open file "catalog.dat"
        Read number of relations in the database
        for each relation:
            ## Read relation name and schema information
            Read relation name
            Read number of attributes in the relation
            Create empty array lists for attributes and domain
            for each attribute:
                Read attribute name
                Read attribute domain
                Add attribute name to attributes array list
                Add domain to domain array list
            Create a new Relation object
            ## Now Read data for tuples, create tuples and add to relation
            Construct file name for relation data file
            Open the relation data file
            Read number of tuples in relation
            for each tuple:
                Create new tuple object
                for each component of tuple:
                    Read component value and add to tuple
                Add tuple to relation
            Close relation data file
            Add relation to database
        Close catalog.dat file
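The pseudo code above can be sketched in Python as follows. This is an illustration under several assumptions that the handout does not confirm: every item in catalog.dat and in the data files sits on its own line, the data file for a relation is named after the relation with a .dat extension, and all values are kept as strings. The function load_database and the dictionary layout are hypothetical; the real solution should build Relation and Tuple objects instead.

```python
import os
import tempfile


def load_database(dir_path):
    """Read catalog.dat and the per-relation data files into plain dicts."""
    database = {}
    with open(os.path.join(dir_path, "catalog.dat"), "r") as catalog:
        num_relations = int(catalog.readline().strip())
        for _ in range(num_relations):
            # Read relation name and schema information.
            name = catalog.readline().strip()
            num_attributes = int(catalog.readline().strip())
            attributes, domains = [], []
            for _ in range(num_attributes):
                attributes.append(catalog.readline().strip())
                domains.append(catalog.readline().strip())

            # ASSUMPTION: the data file is named <relation name>.dat.
            tuples = []
            with open(os.path.join(dir_path, name + ".dat"), "r") as data:
                num_tuples = int(data.readline().strip())
                for _ in range(num_tuples):
                    # ASSUMPTION: one component value per line, kept as a string.
                    values = [data.readline().strip()
                              for _ in range(num_attributes)]
                    tuples.append(values)

            database[name] = {
                "attributes": attributes,
                "domains": domains,
                "tuples": tuples,
            }
    return database


# Usage with made-up files that follow the assumed layout.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "catalog.dat"), "w") as f:
    f.write("1\nBAR\n1\nBNAME\nVARCHAR\n")
with open(os.path.join(demo_dir, "BAR.dat"), "w") as f:
    f.write("2\nJillians\nDugans\n")

db = load_database(demo_dir)
print(db["BAR"]["attributes"])  # ['BNAME']
print(db["BAR"]["tuples"])      # [['Jillians'], ['Dugans']]
```

If the actual files use a different layout, only the reading steps need to change; the nesting of the loops follows the pseudo code directly.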