- 1 Introducing the shell
- 2 Navigating files and directories
- 3 Working with files and directories
- 4 Pipes and filters
- 5 Shell scripts
- 6 Loops
- 7 Finding things
- 8 Shell extras
- 9 Credits
- 10 References
- 11 Data Sources
- Automate basic tasks
- Underlies many other open source languages and applications and can be used to glue them together
- Essential for system administration, remote computing, and high-performance computing
- Many concise special-purpose tools that can make your life easier
- Complements more fully-featured application programming languages
PowerPoint slides: Unix family tree
- Unix-like operating systems share a common architecture and layout
- Roughly compatible, with similar (or identical) shells and tools
- The environment in which most open-source software was written
- Broadly speaking, there is a tension between making computer systems fast and making them easy to use.
- A common solution is to create a 2-layer architecture: A fast, somewhat opaque core surrounded by a more friendly scriptable interface (also referred to as "hooks" or an "API"). Examples of this include video games, Emacs and other highly customizable code editors, and high-level special-purpose languages like Stata and Mathematica.
- The Unix shell is the scriptable layer around the operating system. It provides a simple interface for making the operating system do work, without having to know exactly how it accomplishes that work.
The design and terminology of modern computers are based on metaphors from a previous age.
- Files and folders
- Teletype input and output
- Modern touch devices don't expose the file system, so you may be less comfortable with navigating directory trees than people whose primary computing devices were desktop computers
PowerPoint slides: "Navigating files and directories"
whoami
- Current working directory
    pwd    # Print Working Directory
- By default, this is probably your home directory (discuss how to view this in Finder or File Explorer)
  - Linux: /home/nelle
  - Mac OS: /Users/nelle
  - Windows: C:\Users\nelle
- List the contents of the directory
    ls    # List directory contents
- Command flags modify what a command does
    ls -F        # Show category markers
    ls --help    # In-line help info; should work in Windows
    man ls       # Manual for "ls"
  - You can navigate through the man page using the space bar and arrow keys
  - Quit man with "q"
  - Online references are available for Windows users who don't have man pages: https://linux.die.net/
- When a command is followed by an argument, it acts on that argument.
    ls -F Desktop                      # Get contents of folder
    ls -F Desktop/shell-lesson-data    # Get contents of subfolder
- Move down the directory tree
    cd Desktop
    cd shell-lesson-data
    cd exercise-data
- Now that you're "in" a new location, the context for your commands is different
    pwd
    ls -F
    # This produces an error because the folder is in a different location
    # relative to the working directory
    cd shell-lesson-data
- Move up the directory tree. `.` is shorthand for "current directory"; `..` is shorthand for "parent directory".
    # Show hidden files, including current and parent directories
    ls -a
    # You can combine flags
    ls -Fa
    # Move to parent directory
    cd ..
- Shortcuts
    cd ~    # Go to home directory
    cd -    # Go back to previous directory
- An absolute path specifies a location from the root of the file system.
- A relative path specifies a location starting from the current location.
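The difference between the two kinds of path can be checked directly. The sketch below works in a throwaway directory created with `mktemp` (the directory names are stand-ins, not the real lesson data) and reaches the same folder once by a relative path and once by an absolute path.

```shell
# Scratch directory so the example does not touch real files (hypothetical layout)
scratch=$(mktemp -d)
mkdir -p "$scratch/exercise-data/writing"

# Relative path: interpreted starting from the current working directory
cd "$scratch/exercise-data"
cd writing
relative_result=$(pwd)

# Absolute path: spelled out from the root, so it works from any location
cd /
cd "$scratch/exercise-data/writing"
absolute_result=$(pwd)

# Both routes end in the same directory
echo "$relative_result"
echo "$absolute_result"
```

The relative `cd writing` only works from inside `exercise-data`; the absolute form works from anywhere, at the cost of more typing.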
- See where we are and what we have
    pwd
    cd exercise-data/writing    # Traverse several layers at once
    ls -F
- Create a directory
    # Make a subdirectory
    mkdir thesis
    ls -F
    # Make multiple directories; create intermediate dirs as required
    mkdir -p ../project/data ../project/results
    # Show all directory contents recursively
    ls -FR ../project
- Create a text file. Note that everything is available through both the file browser and the terminal.
    cd thesis
    nano draft.txt
  Type some text into the file, for example:
    This is my first draft
    boop beep boop
- Edit with Notepad / TextEdit, then re-edit with nano.
- Move our file to a new location
    cd ~/Desktop/shell-lesson-data/exercise-data/writing
    # Rename the file by moving it
    mv thesis/draft.txt thesis/quotes.txt
    # Verify the new file name
    ls thesis
    # You can also specify the exact file name
    ls thesis/quotes.txt
- Move our file to the current working directory
    mv thesis/quotes.txt .
    ls thesis/quotes.txt    # Not here anymore
    ls                      # Now here
- Copy a single file
    cp quotes.txt thesis/quotations.txt
    ls thesis
    ls
    # Alternatively
    ls quotes.txt thesis/quotations.txt
- Copy a directory recursively
    cp -r thesis thesis_backup
    ls thesis thesis_backup
- Remove a file
    rm quotes.txt
    ls quotes.txt
- Remove a file interactively. Deletion is forever!
    rm -i thesis_backup/quotations.txt
- Remove a directory and its contents
    rm thesis        # This gives us an error
    rm -ri thesis    # Remove recursively, confirming each deletion
Deletion is forever. Consider making a backup archive as part of your workflow.
- Create an archive with tar ("tape archive").
    cd ~/Desktop/shell-lesson-data/exercise-data/
    # [c]reate a new archive with the given [f]ilename
    tar -cf writing.tar writing/
- Create a compressed (zipped) archive.
    # [a]uto-compress the archive based on its file extension
    tar -acf writing.zip writing/
    # FYI, you may also see
    tar -a -cf writing.zip writing/
    # FYI, Linux servers frequently use g[z]ip
    tar -z -cf writing.tgz writing/
  tar is an old utility and can be finicky about the order of flags.
- Extract your archive
    mv writing writing_backup
    # e[x]tract the archive to get the original files back
    tar -xf writing.zip
    # Compare the old and restored directories
    ls writing
    ls writing_backup
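The whole backup-and-restore cycle can be rehearsed on disposable data before trusting it with real work. This is a minimal sketch, assuming a scratch directory and a made-up `writing/thesis/quotes.txt` file; it uses the gzip form (`-z`) and adds two checks that are not in the steps above: `tar -t` to list an archive without extracting it, and `diff -r` to compare directory trees.

```shell
# Scratch directory with stand-in files (hypothetical names and contents)
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p writing/thesis
echo "draft" > writing/thesis/quotes.txt

# [c]reate a g[z]ipped archive with the given [f]ilename
tar -czf writing.tgz writing/

# lis[t] the archive contents without extracting anything
tar -tzf writing.tgz

# Move the original aside, then e[x]tract the archive
mv writing writing_backup
tar -xzf writing.tgz

# diff -r prints nothing and succeeds when the two trees match
diff -r writing writing_backup && echo "identical"
```

Listing the archive before deleting anything is a cheap way to confirm that the backup contains what you expect.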
- There are many useful utilities: https://www.gnu.org/software/coreutils/manual/coreutils.html
- Copy with multiple file names
    cd ~/Desktop/shell-lesson-data/exercise-data/
    # The destination directory must already exist
    mkdir creatures_backup
    cp creatures/minotaur.dat creatures/unicorn.dat creatures_backup/
- Copy using globs ("globals"). You can match exactly one character with ? or any number of characters (including none) with *. This is an example of shell expansion.
    mkdir proteins_backup
    # The shell expands *.pdb into the list of all matching files, then does `cp`
    cp proteins/*.pdb proteins_backup/
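Because the shell expands a glob before the command runs, `echo` is a safe way to preview what a pattern will match. The sketch below uses a scratch directory with a few empty stand-in files (the names are chosen for illustration).

```shell
# Scratch directory with empty stand-in files
scratch=$(mktemp -d)
cd "$scratch"
touch cubane.pdb ethane.pdb methane.pdb notes.txt

# echo simply prints whatever the shell expanded the pattern into
all_pdb=$(echo *.pdb)
one_char=$(echo ?ethane.pdb)
any_chars=$(echo *ethane.pdb)

echo "$all_pdb"      # cubane.pdb ethane.pdb methane.pdb
echo "$one_char"     # methane.pdb (exactly one character before "ethane.pdb")
echo "$any_chars"    # ethane.pdb methane.pdb (zero or more characters)
```

Previewing with `echo` is especially worthwhile before passing a glob to a destructive command such as `rm`.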
The "Unix Philosophy" is to combine many small tools that do one job into a processing pipeline.
FYI, .pdb is the Protein Data Bank format
- Count words in a file using wc
    cd ~/Desktop/shell-lesson-data/exercise-data/proteins/
    ls
    # Inspect cubane.pdb
    cat cubane.pdb
    # [w]ord [c]ount for cubane.pdb
    wc cubane.pdb
- Run wc for all files
    # Run the command with default options
    wc *.pdb
    wc -l *.pdb    # lines
    wc -c *.pdb    # bytes (the same as characters for plain ASCII text)
    wc -w *.pdb    # words
# Redirect output to file
wc -l *.pdb > lengths.txt
ls lengths.txt
cat lengths.txt       # Inspect contents
head -n 1 lengths.txt # Inspect 1st line
less lengths.txt      # Inspect with pager
- The sort command runs the file input through a filter and returns the filtered result.
    sort lengths.txt       # Alphanumeric sort (i.e. text)
    sort -n lengths.txt    # Numeric sort
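The difference between the two sort modes is easiest to see on numbers with different digit counts. A small sketch, using a made-up `numbers.txt` in a scratch directory:

```shell
# Scratch directory with a hypothetical three-line file
scratch=$(mktemp -d)
cd "$scratch"
printf '10\n9\n100\n' > numbers.txt

# Text sort compares character by character, so "100" sorts before "9"
text_sorted=$(sort numbers.txt)
echo "$text_sorted"       # 10, 100, 9 (one per line)

# Numeric sort compares the values themselves
numeric_sorted=$(sort -n numbers.txt)
echo "$numeric_sorted"    # 9, 10, 100 (one per line)
```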
- Send filtered output to new file
    sort -n lengths.txt > sorted_lengths.txt
    cat sorted_lengths.txt
- (Optional) Append to the end of a file using >>
    cd ~/Desktop/shell-lesson-data/exercise-data/animal-counts/
    # Create new file
    head -n 3 animals.csv > animals-subset.csv
    # Append to that file
    tail -n 2 animals.csv >> animals-subset.csv
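The distinction between `>` and `>>` matters because `>` silently replaces whatever was already in the file. A minimal sketch with a hypothetical `log.txt` in a scratch directory:

```shell
# Scratch directory so nothing real is overwritten
scratch=$(mktemp -d)
cd "$scratch"

echo "first"  > log.txt     # creates log.txt
echo "second" > log.txt     # overwrites it: "first" is gone
echo "third"  >> log.txt    # appends: the file now holds "second" and "third"

cat log.txt
```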
Pipe output from one command directly into a second command without creating an intermediate file. This is the cornerstone of Unix workflows.
sort -n lengths.txt | head -n 1
Daisy-chain your commands together. As long as the output of command X is a legitimate input for command Y, it will work.
# Return to the beginning
wc -l *.pdb | sort -n
# Add additional commands
wc -l *.pdb | sort -n | head -n 1
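A pipeline should give the same answer as the equivalent sequence with intermediate files. The sketch below checks this on three made-up files of known length in a scratch directory; the file names are placeholders, not lesson data.

```shell
# Scratch directory with stand-in files of 1, 2 and 3 lines
scratch=$(mktemp -d)
cd "$scratch"
printf 'a\n'       > one.txt
printf 'a\nb\n'    > two.txt
printf 'a\nb\nc\n' > three.txt

# Version 1: save the counts to a file, then sort the file
wc -l one.txt two.txt three.txt > lengths.txt
with_file=$(sort -n lengths.txt | head -n 1)

# Version 2: pass the counts straight down the pipeline
with_pipe=$(wc -l one.txt two.txt three.txt | sort -n | head -n 1)

# Both report the shortest file (one.txt)
echo "$with_file"
echo "$with_pipe"
```

The piped version leaves no `lengths.txt` behind to clean up, and it cannot accidentally be picked up by a later glob.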
- The terminal saves your command history (typically 500 or 1000 commands)
  - You can see previous commands using the up/down arrows
  - You can edit the command that's currently visible and run it
- Once your command history gets big, you might want to search it:
    history              # or `history -1000` in zsh on Mac
    history | grep ls    # Pipe the output of history into search
We should save this stuff and reuse it.
- Create a new script
    cd ~/Desktop/shell-lesson-data/exercise-data/proteins/
    nano middle.sh
- Edit the script file and save
    # Get lines 11-15
    head -n 15 octane.pdb | tail -n 5
- Execute the script
    bash middle.sh
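To convince yourself that `head -n 15 | tail -n 5` really selects lines 11-15, you can run the same pipeline on numbered input. This sketch uses `seq` to generate stand-in lines, so it does not depend on octane.pdb.

```shell
# seq prints the numbers 1 to 20, one per line, standing in for a 20-line file
middle=$(seq 1 20 | head -n 15 | tail -n 5)

# head keeps lines 1-15; tail then keeps the last 5 of those
echo "$middle"    # 11, 12, 13, 14, 15 (one per line)
```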
- Use a special variable to run the script on any file. (`$1` expands to the first argument passed to the script; the surrounding `""` ensures that it works if the filename contains spaces.)
    nano middle.sh
    # Use the 1st argument as your input.
    head -n 15 "$1" | tail -n 5
    bash middle.sh octane.pdb
    bash middle.sh pentane.pdb
- Use additional ordered arguments
    nano middle.sh
    # Select lines from the middle of a file.
    # Usage: bash middle.sh filename end_line num_lines
    head -n "$2" "$1" | tail -n "$3"
    bash middle.sh pentane.pdb 15 5
- Use unlimited arguments
    nano sorted.sh
    # Sort files by their length.
    # Usage: bash sorted.sh one_or_more_filenames
    wc -l "$@" | sort -n
    bash sorted.sh *.pdb ../creatures/*.dat
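The quotes around `"$@"` matter for the same reason as the quotes around `"$1"`. The sketch below, written for bash, uses two hypothetical helper functions (`count_args` and `demo`, not part of the lesson scripts) to show how a filename containing a space is split when the quotes are left off.

```shell
# count_args is a hypothetical helper: it reports how many arguments it received
count_args() {
    echo "$#"
}

# demo forwards its arguments twice: once quoted, once unquoted
demo() {
    count_args "$@"    # quoted: each original argument stays intact
    count_args $@      # unquoted: arguments are re-split at spaces
}

# Two arguments go in; the first contains a space
result=$(demo "my notes.txt" short.txt)
echo "$result"    # 2, then 3
```

With the unquoted form, `wc` would be asked to open `my` and `notes.txt` as separate files and would report that neither exists.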
cd ~/Desktop/shell-lesson-data/exercise-data/animal-counts/
# Get the second column of the CSV
cut -d , -f 2 animals.csv
# Sort the values
cut -d , -f 2 animals.csv | sort
# Get unique values (`uniq` requires values to be adjacent to one another)
cut -d , -f 2 animals.csv | sort | uniq
# 1. Run a python script that produces a .csv as output
# 2. Extract the 2nd column of that .csv and get the unique values
python script.py | cut -d , -f 2 | sort | uniq
Don't repeat yourself.
cd ~/Desktop/shell-lesson-data/exercise-data/creatures/
nano latin.sh
for filename in basilisk.dat minotaur.dat unicorn.dat
do
    # Extract second line of file
    head -n 2 "$filename" | tail -n 1
done
bash latin.sh
nano latin.sh
for filename in *.dat
do
    # Extract second line of file
    head -n 2 "$filename" | tail -n 1
done
bash latin.sh
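When a loop prints one line per file, it can be hard to tell which line came from which file. One possible variation, sketched here on made-up two-line `.dat` files in a scratch directory, labels each result with its filename and redirects the output of the whole loop into a single file.

```shell
# Scratch directory with stand-in data files (contents are invented for illustration)
scratch=$(mktemp -d)
cd "$scratch"
printf 'COMMON NAME: basilisk\nCLASSIFICATION: basiliscus vulgaris\n' > basilisk.dat
printf 'COMMON NAME: unicorn\nCLASSIFICATION: equus monoceros\n' > unicorn.dat

for filename in *.dat
do
    # Label the second line of each file with the file it came from
    echo "$filename: $(head -n 2 "$filename" | tail -n 1)"
done > second_lines.txt    # a redirect after "done" captures the whole loop

cat second_lines.txt
# basilisk.dat: CLASSIFICATION: basiliscus vulgaris
# unicorn.dat: CLASSIFICATION: equus monoceros
```

Because `second_lines.txt` does not match `*.dat`, the output file is not picked up by the loop itself.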
- Create a separate directory for your scripts so that you can find them
    cd ~/Desktop/shell-lesson-data/exercise-data/
    mkdir scripts
    cd scripts
    nano aggregate.sh
- Write a script that takes arbitrary arguments
    for filename in "$@"
    do
        echo "$filename"
    done
- Run the script against the contents of a different directory
    bash aggregate.sh ../proteins/*.pdb
- Do work in the script
    nano aggregate.sh
    for filename in "$@"
    do
        echo "$filename"
        cat "$filename" >> alkanes.pdb
    done
    bash aggregate.sh ../proteins/*.pdb
# List file in long format to show current permissions
ls -l aggregate.sh
# Change file mode (i.e. permissions)
# User can read/write/execute, Group and Other can read
chmod u=rwx,go=r aggregate.sh
# Show changed permissions
ls -l aggregate.sh
# Invoke script
./aggregate.sh ../proteins/*.pdb
cd ~/Desktop/shell-lesson-data/exercise-data/
find .
# List all directories
find . -type d
# List all files
find . -type f
# Do shell expansion, then run command
find . -name *.txt
# Prevent shell expansion and match wildcard
find . -name "*.txt"
Grep is a powerful tool for matching text patterns by using regular expressions. You can find introductory documentation for regular expressions in the References section.
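A few small patterns give a feel for what grep can do. This is a minimal sketch on a made-up three-line `sample.txt` in a scratch directory; the flags shown (`-n`, `-c`, `-E`) are standard grep options.

```shell
# Scratch directory with a hypothetical sample file
scratch=$(mktemp -d)
cd "$scratch"
printf 'alpha 12\nbeta\ngamma 7\n' > sample.txt

# $ anchors the match to the end of the line; -n prefixes the line number
ends_in_a=$(grep -n 'a$' sample.txt)
echo "$ends_in_a"       # 2:beta

# [0-9] matches any single digit; -c counts matching lines
digit_lines=$(grep -c '[0-9]' sample.txt)
echo "$digit_lines"     # 2

# -E enables extended regular expressions, such as alternation with |
alternation=$(grep -E '^(alpha|gamma)' sample.txt)
echo "$alternation"     # alpha 12, then gamma 7
```

The patterns are wrapped in single quotes so that the shell passes them to grep unchanged instead of trying to expand them.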
Consult the Wooledge Bash Guide (see references below) for more on these topics:
- SSH
- Permissions
- Job control
- Aliases and bash customization
- Shell variables
- Mini-languages (grep, sed, AWK)
- Shell expansion
- Conditional tests
- The Unix Shell: https://swcarpentry.github.io/shell-novice/
- A list of command line utilities: https://ss64.com/bash/
- GNU core utilities: https://www.gnu.org/software/coreutils/manual/coreutils.html
- Bash guide: https://mywiki.wooledge.org/BashGuide
- Shell redirection operators (1): https://www.redhat.com/sysadmin/linux-shell-redirection-pipelining
- Shell redirection operators (2): https://www.gnu.org/software/bash/manual/html_node/Redirections.html
- Grep regular expressions: https://www.gnu.org/software/grep/manual/html_node/Regular-Expressions.html
- Using zsh on MacOS: https://scriptingosx.com/2019/06/moving-to-zsh/