Csc 1302, Honors Principles of Computer Science II (Fall 2023)
Week 2 (10 September 2023)
Initialize Database from Files; Check Equality of Tuples; Remove Duplicates in Relation
During this week you will write the following methods:
- Initialize the Database object by reading data stored in several files in a directory that is given on the command line (this method belongs to Database.py)
    # Create the database object by reading data from several files in directory dir
    def initializeDatabase(self, dir):
        pass
Here is a sample of files available in the directory: drinks (also in .zip format: drinks.zip). The file catalog.dat contains schema information for all the relations in the database, and the individual .dat files contain the relation instances (tuples). To read data from a file, you can use the following code:
    f = open("f.dat", "r")
    s = f.readline().strip("\n")
    ...
    ...
    f.close()
- Test equality of tuples (this method belongs to Tuple.py)
    # Return True if this tuple is equal to compareTuple; False otherwise
    # make sure the schemas are the same; return False if the schemas are not the same
    def equals(self, compareTuple):
        pass
- Remove duplicate tuples (this method belongs to Relation.py)
    ## Remove duplicate tuples from this relation
    def removeDuplicates(self):
        pass
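As one possible way to approach the equals and removeDuplicates stubs, here is a minimal sketch. The Tuple and Relation classes below are hypothetical stand-ins, not the course's actual classes: the field names (attributes, domains, values, table) and the helper methods addComponent and addTuple are assumptions made only so the example runs on its own. The data is the STUDENT sample used by DuplicatesDriver.py.

```python
# Hypothetical stand-ins for the course's Tuple and Relation classes.
# Field names (attributes, domains, values, table) are assumptions.

class Tuple:
    def __init__(self, attributes, domains):
        self.attributes = attributes   # list of attribute names
        self.domains = domains         # list of domain names, e.g. "VARCHAR"
        self.values = []               # component values, in schema order

    def addComponent(self, value):
        self.values.append(value)

    def equals(self, compareTuple):
        # The schemas must match before the values are compared.
        if self.attributes != compareTuple.attributes:
            return False
        if self.domains != compareTuple.domains:
            return False
        return self.values == compareTuple.values


class Relation:
    def __init__(self, name, attributes, domains):
        self.name = name
        self.attributes = attributes
        self.domains = domains
        self.table = []                # list of Tuple objects

    def addTuple(self, tup):
        self.table.append(tup)

    def removeDuplicates(self):
        # Keep the first occurrence of each tuple and preserve the order.
        unique = []
        for tup in self.table:
            if not any(tup.equals(kept) for kept in unique):
                unique.append(tup)
        self.table = unique


student = Relation("STUDENT", ["SID", "SNAME"], ["INTEGER", "VARCHAR"])
rows = [(1111, "Robert Adams"), (1112, "Charles Bailey"), (1113, "Donald James"),
        (1112, "Charles Bailey"), (1112, "Charles Bailey"),
        (1114, "Michael James"), (1113, "Donald James")]
for sid, sname in rows:
    t = Tuple(student.attributes, student.domains)
    t.addComponent(sid)
    t.addComponent(sname)
    student.addTuple(t)

student.removeDuplicates()
print(len(student.table))  # prints 4
```

This version compares every tuple against the ones already kept, which is quadratic in the number of tuples but relies only on equals. That is probably adequate for relations of this size.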
Download the driver programs and implement all the methods in the Python classes. Then run the driver programs.
You should see the following output when you run Driver2.py:
    Mac-mini:week2 raj$ python3 Driver2.py drinks
    BAR(BNAME:VARCHAR)
    Number of tuples:4
    Jillians:
    Dugans:
    ESPN Zone:
    Charlies:
    DRINKER(DNAME:VARCHAR)
    Number of tuples:5
    John:
    Peter:
    Donald:
    Jeremy:
    Clark:
    SELLS(BAR:VARCHAR,BEER:VARCHAR,PRICE:INTEGER)
    Number of tuples:9
    Jillians:Bud:6:
    Jillians:Michelob:6:
    Jillians:Heineken:8:
    Dugans:Bud:9:
    Dugans:Michelob:10:
    Dugans:Fosters:12:
    ESPN Zone:Fosters:9:
    Charlies:Heineken:10:
    Charlies:Foster:10:
    Mac-mini:week2 raj$
and the following output when you run the DuplicatesDriver.py program:
    Mac-mini:week2 raj$ python3 DuplicatesDriver.py
    Before Removing Duplicates:
    STUDENT(SID:INTEGER,SNAME:VARCHAR)
    Number of tuples:7
    1111:Robert Adams:
    1112:Charles Bailey:
    1113:Donald James:
    1112:Charles Bailey:
    1112:Charles Bailey:
    1114:Michael James:
    1113:Donald James:
    After Removing Duplicates:
    STUDENT(SID:INTEGER,SNAME:VARCHAR)
    Number of tuples:4
    1111:Robert Adams:
    1112:Charles Bailey:
    1113:Donald James:
    1114:Michael James:
Pseudo code for initializeDatabase
## Create the database object by reading data from several files in directory dir
def initializeDatabase(self, dir):
    ## Pseudo Code follows
    Open file "catalog.dat"
    Read number of relations in the database
    for each relation:
        ## Read relation name and schema information
        Read relation name
        Read number of attributes in the relation
        Create empty lists for attributes and domains
        for each attribute:
            Read attribute name
            Read attribute domain
            Add attribute name to attributes list
            Add domain to domains list
        Create a new Relation object
        ## Now read data for tuples, create tuples and add to relation
        Construct file name for relation data file
        Open the relation data file
        Read number of tuples in relation
        for each tuple:
            Create new tuple object
            for each component of tuple:
                Read component value and add to tuple
            Add tuple to relation
        Close relation data file
        Add relation to database
    Close catalog.dat file
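
The pseudo code above can be sketched in runnable Python as follows. This is only an illustration under several assumptions that the handout does not confirm: every value in catalog.dat and in the relation files sits on its own line, the data file for a relation is named after the relation with ".dat" appended, and INTEGER components are converted to int. The function name load_database and the helper read_line are hypothetical, and plain lists and a dictionary stand in for the course's Database, Relation and Tuple classes.

```python
import os
import tempfile


def read_line(f):
    # Read one line and drop the trailing newline.
    return f.readline().strip("\n")


def load_database(directory):
    """Return {relation name: (attributes, domains, rows)} read from directory."""
    database = {}
    with open(os.path.join(directory, "catalog.dat"), "r") as catalog:
        num_relations = int(read_line(catalog))
        for _ in range(num_relations):
            # Read relation name and schema information.
            name = read_line(catalog)
            num_attributes = int(read_line(catalog))
            attributes, domains = [], []
            for _ in range(num_attributes):
                attributes.append(read_line(catalog))
                domains.append(read_line(catalog))

            # Read the tuples. Assumed file naming: <relation name>.dat
            rows = []
            with open(os.path.join(directory, name + ".dat"), "r") as data:
                num_tuples = int(read_line(data))
                for _ in range(num_tuples):
                    row = []
                    for domain in domains:
                        value = read_line(data)
                        # Convert INTEGER components; keep the rest as strings.
                        row.append(int(value) if domain == "INTEGER" else value)
                    rows.append(row)
            database[name] = (attributes, domains, rows)
    return database


# Demo on a throwaway directory laid out under the assumptions above.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "catalog.dat"), "w") as f:
        f.write("1\nSELLS\n3\nBAR\nVARCHAR\nBEER\nVARCHAR\nPRICE\nINTEGER\n")
    with open(os.path.join(d, "SELLS.dat"), "w") as f:
        f.write("2\nJillians\nBud\n6\nDugans\nFosters\n12\n")
    db = load_database(d)

print(db["SELLS"][2])  # prints [['Jillians', 'Bud', 6], ['Dugans', 'Fosters', 12]]
```

Using with blocks closes each file even if a read fails partway through, which takes the place of the explicit "Close" steps in the pseudo code.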