1. Introduction and flat files
1.1 Welcome to the course!
1.2 Exploring your working directory
In order to import data into Python, you should first have an idea of what files are in your working directory.
IPython, which is running on DataCamp’s servers, has a bunch of cool commands, including its magic commands. For example, starting a line with ! gives you complete system shell access. This means that the IPython magic command ! ls will display the contents of your current directory. Your task is to use the IPython magic command ! ls to check out the contents of your current directory and answer the following question: which of the following files is in your working directory?
□ huck_finn.txt
□ titanic.csv
■ moby_dick.txt
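Outside IPython the ! prefix is not available. As a rough sketch of the same check in plain Python, the standard-library os module can list the working directory. Note that, unlike a default ls, os.listdir() also returns hidden entries whose names start with a dot.

```python
import os

# Find the current working directory and list its entries, sorted by name.
# This is roughly what `! ls` shows, but it also works outside IPython.
cwd = os.getcwd()
entries = sorted(os.listdir(cwd))

print(cwd)
for name in entries:
    print(name)
```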
1.3 Importing entire text files
In this exercise, you’ll be working with the file moby_dick.txt. It is a text file that contains the opening sentences of Moby Dick, one of the great American novels! Here you’ll get experience opening a text file, printing its contents to the shell and, finally, closing it.
Instruction
- Open the file moby_dick.txt as read-only and store it in the variable file. Make sure to pass the filename enclosed in quotation marks ''.
- Print the contents of the file to the shell using the print() function. As Hugo showed in the video, you’ll need to apply the method read() to the object file.
- Check whether the file is closed by executing print(file.closed).
- Close the file using the close() method.
- Check again that the file is closed as you did above.
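The steps above can be sketched as follows. This is one possible solution, not necessarily the course's reference answer. As an assumption made only so the sketch runs anywhere, it writes a short stand-in moby_dick.txt when no such file exists; the course's real file has different contents.

```python
from pathlib import Path

# Assumption: if moby_dick.txt is missing, create a short stand-in file
# so this sketch is runnable. It is NOT the course's actual file.
path = Path("moby_dick.txt")
if not path.exists():
    path.write_text("CHAPTER 1. Loomings.\n\nCall me Ishmael.\n")

# Open the file read-only ('r' is also the default mode).
file = open("moby_dick.txt", mode="r")

# Read the whole file into a single string and print it.
text = file.read()
print(text)

# The file is still open here, so this prints False.
closed_before = file.closed
print(closed_before)

# Close the file, then check again; this prints True.
file.close()
closed_after = file.closed
print(closed_after)
```

Forgetting the close() call leaves the file handle open until the object is garbage-collected, which is the problem the context manager in the next exercise avoids.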
1.4 Importing text files by lines
For large files, you may not want to print all of their content to the shell: you may wish to print only the first few lines. Enter the readline() method, which allows you to do this. When a file called file is open, you can print out the first line by executing file.readline(). If you execute the same command again, the second line will print, and so on.
In the introductory video, Hugo also introduced the concept of a context manager. He showed that you can bind a variable file by using a context manager construct:
    with open('huck_finn.txt') as file:
While still within this construct, the variable file will be bound to open('huck_finn.txt'); thus, to print the first line of the file to the shell, all the code you need to execute is:
    with open('huck_finn.txt') as file:
        print(file.readline())
You’ll now use these tools to print the first few lines of moby_dick.txt!
Instruction
- Open moby_dick.txt using the with context manager and the variable file.
- Print the first three lines of the file to the shell by using readline() three times within the context manager.
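A minimal sketch of these two steps is shown below; it is one possible solution rather than the course's reference answer. As an assumption for runnability, it creates a short stand-in moby_dick.txt if the file is missing. The names first, second and third are introduced here only for illustration.

```python
from pathlib import Path

# Assumption: if moby_dick.txt is missing, create a short stand-in file
# so this sketch is runnable. It is NOT the course's actual file.
path = Path("moby_dick.txt")
if not path.exists():
    path.write_text("CHAPTER 1. Loomings.\n\nCall me Ishmael.\n")

# The context manager closes the file automatically when the block ends.
with open("moby_dick.txt") as file:
    # Each readline() call returns the next line, including its newline.
    first = file.readline()
    second = file.readline()
    third = file.readline()
    print(first)
    print(second)
    print(third)

# No explicit close() is needed: the file is already closed here.
print(file.closed)
```

Because readline() keeps the trailing newline and print() adds another, each line is followed by a blank line in the output; passing end='' to print() would suppress the extra one.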
1.5 The Importance of flat files in data science
1.6 Pop quiz: examples of flat files?
You’re now well-versed in importing text files and you’re about to become a wiz at importing flat files. But can you remember exactly what a flat file is? Test your knowledge by answering the following question: which of the file types below is NOT an example of a flat file?
□ A .csv file.
□ A tab-delimited .txt.
■ A relational database (e.g. PostgreSQL).
1.7 Pop quiz: what exactly are flat files?
Which of the following statements about flat files is incorrect?
□ Flat files consist of rows and each row is called a record.
■ Flat files consist of multiple tables with structured relationships between the tables.
□ A record in a flat file is composed of fields or attributes, each of which contains at most one item of information.
□ Flat files are pervasive in data science.
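To make the vocabulary in these statements concrete, here is a small sketch that parses a comma-separated flat file with Python's standard csv module. The data is hypothetical and invented purely for illustration: each row after the header is one record, and each column is one field.

```python
import csv
import io

# Hypothetical flat-file contents, invented for illustration only.
# The first row is the header; every later row is one record.
raw = "name,age,city\nAda,36,London\nLinus,28,Helsinki\n"

# DictReader maps each record's fields to the header names.
# Every field value is read as a string.
reader = csv.DictReader(io.StringIO(raw))
records = list(reader)

for record in records:
    print(record)
```

There is only a single table here, with no links to other tables, which is what separates a flat file from a relational database.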