Unix-like OSes
- Linux: led by Linus Torvalds. Does not share kernel code base with original Unixes.
- macOS (since OS X): built on top of Darwin, an open-source OS derived from BSD Unix (including FreeBSD) and the Mach kernel. It is POSIX compliant. Most shell commands we review here apply to the macOS terminal as well. Beware that the macOS file system is case-insensitive by default (a legacy of classic Mac OS, before version 10).
- Windows/DOS, unfortunately, is a totally different breed (Windows 10, though, includes the Windows Subsystem for Linux).
Write programs that do one thing and do it well. Write programs to work together and that encourage open standards. Write programs to handle text streams, because that is a universal interface.
- Doug McIlroy, the Bell System Technical Journal (1978)
Modularity
- The Unix shell (we’ll learn about this shortly) was designed to allow users to easily build complex workflows by interfacing smaller modular programs together.
wget | awk | grep | sort | uniq | plot
- An alternative approach is to write a single complex program that takes raw data as input, and after hours of data processing, outputs publication figures and a final table of results.
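As a small illustration of the modular style, the sketch below counts word frequencies by chaining standard tools, each doing one job. The sample words and the variable name `top_words` are made up for illustration.

```shell
# Sample input: one word per line (made-up data for illustration).
words="apple
banana
apple
cherry
banana
apple"

# sort groups identical lines together, uniq -c counts each group,
# sort -rn orders by count (largest first), head keeps the top two,
# and awk swaps the columns to print "word count".
top_words=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 2 | awk '{print $2, $1}')

echo "$top_words"
```

Any stage can be swapped out (for example, `head -n 2` for `tail -n 1`) without touching the others, which is the point of the pipeline design.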
Why Linux
Linux is the most common platform for scientific computing.
Debian/Ubuntu is a popular choice for personal computers.
RHEL/CentOS is popular on servers.
The teaching servers for this class run CentOS 7.
Enter Linux
Source: https://uproxx.com/movies/enter-the-draon-trivia/
On Linux or Mac, access the teaching server by
ssh username@your-teaching-server-name-or-ip-address
- The server addresses and usernames are provided in class.
Windows machines need the PuTTY program (free).
Once you log in, change your password:
passwd
Show distribution/version on Linux:
cat /etc/*-release
CentOS Linux release 7.5.1804 (Core)
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
CentOS Linux release 7.5.1804 (Core)
CentOS Linux release 7.5.1804 (Core)
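The os-release part of this output is written as shell variable assignments, so a script can read a single field instead of printing everything. A minimal sketch, assuming a file in that format; the temporary sample file is hypothetical and its values are copied from the output above. On a real system the path would typically be /etc/os-release.

```shell
# Sample file in os-release format; values copied from the CentOS 7 output above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
NAME="CentOS Linux"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
EOF

# Source the file inside $( ... ), which runs in a subshell, so the
# variables it defines do not leak into the calling shell.
pretty=$(. "$sample"; printf '%s\n' "$PRETTY_NAME")
version=$(. "$sample"; printf '%s\n' "$VERSION_ID")

echo "$pretty (version $version)"
rm -f "$sample"
```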
Show distribution/version on Mac:
sw_vers -productVersion
## 10.12.6
or
system_profiler SPSoftwareDataType
## Software:
##
## System Software Overview:
##
## System Version: macOS 10.12.6 (16G1314)
## Kernel Version: Darwin 16.7.0
## Boot Volume: Macintosh HD
## Boot Mode: Normal
## Computer Name: Joong Ho Won’s Pro
## User Name: Joong-Ho Won (jhwon)
## Secure Virtual Memory: Enabled
## System Integrity Protection: Enabled
## Time since boot: 2 days 9:26
Linux shells
What is a shell?
A shell translates commands to OS instructions (similar to cmd.exe on Windows).
Most commonly used shells include bash, csh, tcsh, zsh, etc.
Sometimes a script or a command does not run simply because it’s written for another shell.
We mostly use bash shell commands in this class.
Determine your login shell (the default shell for your account):
echo $SHELL
## /bin/bash
List available shells:
cat /etc/shells
## # List of acceptable shells for chpass(1).
## # Ftpd will not allow users to connect who are not using
## # one of these shells.
##
## /bin/bash
## /bin/csh
## /bin/ksh
## /bin/sh
## /bin/tcsh
## /bin/zsh
Bash completion
Bash provides the following standard completions by default. They save typing time and reduce errors.
Pathname completion.
Filename completion.
Variable name completion: echo $[TAB][TAB].
Username completion: cd ~[TAB][TAB].
Hostname completion: ssh username@[TAB][TAB].
It can also be customized to auto-complete other things, such as a command’s options and arguments. Search for bash completion for more information.
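As a sketch of such customization, bash's `complete` builtin can attach a fixed word list to a command, and `compgen` shows which candidates a given prefix would produce. This requires bash; the command name `myservice` and its subcommands are hypothetical.

```shell
# Register a fixed list of candidate arguments for a hypothetical command.
# After this, typing `myservice sta[TAB][TAB]` in an interactive bash
# session would offer the matching words.
complete -W "start stop status restart" myservice

# compgen -W generates the candidates that match a prefix (here "sta"),
# which previews the completion without pressing TAB.
matches=$(compgen -W "start stop status restart" -- sta)
echo "$matches"
```

To make such a rule permanent, the `complete` line would usually go in a startup file such as ~/.bashrc.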
Navigating file system
Linux directory structure
- Upon login, the user is in his/her home directory.
Move around the file system
pwd prints the absolute path to the current working directory:
pwd
## /Users/jhwon/Dropbox/class/326.621A/2018/datasci/lectures/02-linux
ls lists the contents of a directory:
ls
## Emacs_Reference_Card.pdf
## Kernighan_Pike.jpg
## Richard_Stallman_2013.png
## Vi_Cheat_Sheet.pdf
## autoSim.R
## enter-the-dragon.jpg
## key_authentication_1.png
## key_authentication_2.png
## linux.html
## linux.nb.html
## linux1.Rmd
## linux1.html
## linux2.Rmd
## linux2.html
## linux_directory_structure.png
## linux_filepermission.png
## linux_filepermission_oct.png
## meanEst.R
## runSim.R
## screenshot_top.png
ls -l lists detailed contents of a directory:
ls -l
## total 21440
## -rw-r--r--@ 1 jhwon staff 110345 Aug 21 12:42 Emacs_Reference_Card.pdf
## -rw-r--r--@ 1 jhwon staff 146317 Sep 2 21:54 Kernighan_Pike.jpg
## -rw-r--r--@ 1 jhwon staff 141962 Aug 21 12:42 Richard_Stallman_2013.png
## -rw-r--r--@ 1 jhwon staff 199492 Aug 21 12:42 Vi_Cheat_Sheet.pdf
## -rw-r--r--@ 1 jhwon staff 263 Aug 21 12:42 autoSim.R
## -rw-r--r--@ 1 jhwon staff 122738 Sep 2 22:01 enter-the-dragon.jpg
## -rw-r--r--@ 1 jhwon staff 321281 Aug 21 12:42 key_authentication_1.png
## -rw-r--r--@ 1 jhwon staff 96119 Aug 21 12:42 key_authentication_2.png
## -rw-r--r--@ 1 jhwon staff 2650052 Sep 2 18:59 linux.html
## -rw-r--r--@ 1 jhwon staff 2848834 Aug 22 10:14 linux.nb.html
## -rw-r--r--@ 1 jhwon staff 8955 Sep 3 00:54 linux1.Rmd
## -rw-r--r--@ 1 jhwon staff 1321978 Sep 3 00:48 linux1.html
## -rw-r--r--@ 1 jhwon staff 15069 Sep 3 00:42 linux2.Rmd
## -rw-r--r--@ 1 jhwon staff 2427821 Sep 3 00:42 linux2.html
## -rw-r--r--@ 1 jhwon staff 11662 Aug 21 12:42 linux_directory_structure.png
## -rw-r--r--@ 1 jhwon staff 102188 Aug 21 12:42 linux_filepermission.png
## -rw-r--r--@ 1 jhwon staff 42472 Aug 21 12:42 linux_filepermission_oct.png
## -rw-r--r--@ 1 jhwon staff 381 Aug 21 12:42 meanEst.R
## -rw-r--r--@ 1 jhwon staff 498 Aug 21 12:42 runSim.R
## -rw-r--r--@ 1 jhwon staff 373194 Sep 2 16:28 screenshot_top.png
ls -al lists all contents of a directory, including those whose names start with . (hidden files and folders):
ls -al
## total 21448
## drwxr-xr-x@ 23 jhwon staff 782 Sep 3 00:48 .
## drwxr-xr-x@ 20 jhwon staff 680 Aug 22 06:29 ..
## -rw-r--r--@ 1 jhwon staff 4076 Sep 3 00:42 .RData
## -rw-r--r--@ 1 jhwon staff 110345 Aug 21 12:42 Emacs_Reference_Card.pdf
## -rw-r--r--@ 1 jhwon staff 146317 Sep 2 21:54 Kernighan_Pike.jpg
## -rw-r--r--@ 1 jhwon staff 141962 Aug 21 12:42 Richard_Stallman_2013.png
## -rw-r--r--@ 1 jhwon staff 199492 Aug 21 12:42 Vi_Cheat_Sheet.pdf
## -rw-r--r--@ 1 jhwon staff 263 Aug 21 12:42 autoSim.R
## -rw-r--r--@ 1 jhwon staff 122738 Sep 2 22:01 enter-the-dragon.jpg
## -rw-r--r--@ 1 jhwon staff 321281 Aug 21 12:42 key_authentication_1.png
## -rw-r--r--@ 1 jhwon staff 96119 Aug 21 12:42 key_authentication_2.png
## -rw-r--r--@ 1 jhwon staff 2650052 Sep 2 18:59 linux.html
## -rw-r--r--@ 1 jhwon staff 2848834 Aug 22 10:14 linux.nb.html
## -rw-r--r--@ 1 jhwon staff 8955 Sep 3 00:54 linux1.Rmd
## -rw-r--r--@ 1 jhwon staff 1321978 Sep 3 00:48 linux1.html
## -rw-r--r--@ 1 jhwon staff 15069 Sep 3 00:42 linux2.Rmd
## -rw-r--r--@ 1 jhwon staff 2427821 Sep 3 00:42 linux2.html
## -rw-r--r--@ 1 jhwon staff 11662 Aug 21 12:42 linux_directory_structure.png
## -rw-r--r--@ 1 jhwon staff 102188 Aug 21 12:42 linux_filepermission.png
## -rw-r--r--@ 1 jhwon staff 42472 Aug 21 12:42 linux_filepermission_oct.png
## -rw-r--r--@ 1 jhwon staff 381 Aug 21 12:42 meanEst.R
## -rw-r--r--@ 1 jhwon staff 498 Aug 21 12:42 runSim.R
## -rw-r--r--@ 1 jhwon staff 373194 Sep 2 16:28 screenshot_top.png
.. denotes the parent of the current working directory.
. denotes the current working directory.
~ denotes the user’s home directory.
/ denotes the root directory.
cd .. changes to the parent directory.
cd or cd ~ changes to the home directory.
cd / changes to the root directory.
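The relative-path shortcuts can be tried safely inside a throwaway directory. A minimal sketch; the directory names `project` and `data` are made up for illustration.

```shell
# Work inside a temporary directory so nothing else is touched.
# pwd -P resolves symbolic links, which keeps later comparisons reliable.
base=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$base/project/data"

cd "$base/project/data"
cd ..                      # up one level, to .../project
parent=$(pwd -P)

cd ./data                  # . is the current directory
cd ../..                   # up two levels, back to the base directory
top=$(pwd -P)

cd /                       # root directory
root_dir=$(pwd)

echo "$parent"
echo "$root_dir"
rm -rf "$base"
```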
File permissions
chmod g+x file makes a file executable to group members.
chmod 751 file sets permission rwxr-x--x to a file.
groups userid shows which group(s) a user belongs to:
groups jhwon
## staff everyone localaccounts _appserverusr admin _appserveradm com.apple.access_ssh _appstore _lpadmin _lpoperator _developer com.apple.access_ftp com.apple.access_screensharing
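The two chmod forms can be checked against what `ls -l` reports. A minimal sketch on a scratch file; the file name `script.sh` is made up for illustration.

```shell
workdir=$(mktemp -d)
file="$workdir/script.sh"
touch "$file"

# Octal form: 7 = rwx (owner), 5 = r-x (group), 1 = --x (others).
chmod 751 "$file"
# Keep only the 10-character type-and-permission field of `ls -l`
# (some systems append an extra marker such as @ or . after it).
perm_octal=$(ls -l "$file" | cut -c1-10)

# Symbolic form: start from rw-r--r-- and add execute for the group only.
chmod 644 "$file"
chmod g+x "$file"
perm_symbolic=$(ls -l "$file" | cut -c1-10)

echo "$perm_octal"      # -rwxr-x--x
echo "$perm_symbolic"   # -rw-r-xr--
rm -rf "$workdir"
```

Each octal digit is the sum of r = 4, w = 2, and x = 1, so 7 = 4 + 2 + 1, 5 = 4 + 1, and 1 is execute only.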
Manipulate files and directories
cp copies a file to a new location.
mv moves a file to a new location (or renames it).
touch creates an empty file; if the file already exists, its contents are left unchanged and only its timestamps are updated.
rm deletes a file.
mkdir creates a new directory.
rmdir deletes an empty directory.
rm -rf deletes a directory and all contents in that directory (be cautious with the -f option, which suppresses confirmation prompts).
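These commands can be exercised together in a scratch directory. A minimal sketch; all file and directory names are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"

touch notes.txt                 # create an empty file
mkdir archive                   # create a directory
cp notes.txt archive/copy.txt   # copy; the original stays in place
mv notes.txt renamed.txt        # move within a directory, i.e. rename
rm archive/copy.txt             # delete a single file
rmdir archive                   # works because archive is now empty

remaining=$(ls)
echo "$remaining"

cd /
rm -rf "$workdir"               # remove the directory and everything in it
```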
Find files
find is similar to locate but has more functionality (e.g., selecting files by age, size, or permissions) and is ubiquitous.
find linux1.Rmd
## linux1.Rmd
or
find ~ -name linux1.Rmd
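Selection by name pattern, size, and age can be sketched as follows on a scratch directory tree. The layout and file names are made up for illustration.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/sim/results"
touch "$workdir/sim/runSim.R" "$workdir/sim/meanEst.R" "$workdir/sim/results/out.txt"
printf 'some content\n' > "$workdir/sim/notes.txt"

# By name pattern; quote the pattern so the shell passes it to find unexpanded.
r_files=$(cd "$workdir" && find . -type f -name '*.R' | sort)

# By size: regular files that are empty.
empty_count=$(find "$workdir" -type f -size 0 | wc -l | tr -d ' ')

# By age: regular files modified less than one day ago.
recent_count=$(find "$workdir" -type f -mtime -1 | wc -l | tr -d ' ')

echo "$r_files"
echo "$empty_count empty, $recent_count recent"
rm -rf "$workdir"
```

Tests can be combined; listing several on one command line means all of them must hold.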
Wildcard characters
? | any single character
* | any character 0 or more times
+ | one or more of the preceding pattern (regular expressions only, not a shell wildcard)
^ | beginning of the line (regular expressions only, not a shell wildcard)
$ | end of the line (regular expressions only, not a shell wildcard)
[set] | any character in set
[!set] | any character not in set
[a-z] | any lowercase letter
[0-9] | any digit (same as [0123456789])
# all png files in current folder
ls -l *.png
## -rw-r--r--@ 1 jhwon staff 141962 Aug 21 12:42 Richard_Stallman_2013.png
## -rw-r--r--@ 1 jhwon staff 321281 Aug 21 12:42 key_authentication_1.png
## -rw-r--r--@ 1 jhwon staff 96119 Aug 21 12:42 key_authentication_2.png
## -rw-r--r--@ 1 jhwon staff 11662 Aug 21 12:42 linux_directory_structure.png
## -rw-r--r--@ 1 jhwon staff 102188 Aug 21 12:42 linux_filepermission.png
## -rw-r--r--@ 1 jhwon staff 42472 Aug 21 12:42 linux_filepermission_oct.png
## -rw-r--r--@ 1 jhwon staff 373194 Sep 2 16:28 screenshot_top.png
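The shell wildcards ?, *, [set], and [!set] can be compared side by side on a few scratch files. A minimal sketch; the file names are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch fig1.png fig2.png fig10.png figA.png notes.txt

# The shell expands each pattern into matching file names before the
# command (here echo) runs.
all_png=$(echo *.png | wc -w | tr -d ' ')   # * : any characters, 0 or more
one_char=$(echo fig?.png)                   # ? : exactly one character
one_digit=$(echo fig[0-9].png)              # [0-9] : one digit
not_digit=$(echo fig[!0-9].png)             # [!0-9] : one non-digit

echo "$all_png png files"
echo "$one_char"
cd /
rm -rf "$workdir"
```

Note that fig10.png matches `*.png` but not `fig?.png`, because ? stands for exactly one character.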
Regular expression
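Wildcards are a simpler relative of regular expressions: the two share some symbols, but the meanings differ (e.g., * alone matches any string in a wildcard pattern, whereas in a regular expression it repeats the preceding item).
Regular expressions are a powerful tool to efficiently sift through large amounts of text: record linking, data cleaning, scraping data from websites or other data feeds.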
- Simple regular expressions
^: start of the string
$: end of the string
?: 0 or 1 repetition of the preceding item
+: 1 or more repetitions of the preceding item
*: 0 or more repetitions of the preceding item
Search for regular expressions to learn more.
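The anchors and repetition operators can be tried with `grep -E`, which uses extended regular expressions. A minimal sketch; the sample lines are made up for illustration.

```shell
# Made-up sample lines for illustration.
lines="color
colour
colouur
multicolor
colr"

# ^ and $ anchor the match to the whole line; u? allows zero or one "u".
zero_or_one=$(printf '%s\n' "$lines" | grep -E '^colou?r$')

# u+ requires at least one "u"; u* allows any number, including none.
one_or_more=$(printf '%s\n' "$lines" | grep -cE '^colou+r$')
zero_or_more=$(printf '%s\n' "$lines" | grep -cE '^colou*r$')

# Without ^, the pattern may match anywhere before the end of the line.
unanchored=$(printf '%s\n' "$lines" | grep -cE 'colou?r$')

echo "$zero_or_one"
```

The `-c` option makes grep print the number of matching lines instead of the lines themselves.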