1-DAV-202 Data Management 2023/24
Previously 2-INF-185 Data Source Integration

· Grades from marked homeworks are on the server in file /grades/userid.txt
· Dates of project submission and oral exams:
Early: submit project May 24 9:00am, oral exams May 27 1:00pm (limit 5 students).
Otherwise submit project June 11, 9:00am, oral exams June 18 and 21 (estimated 9:00am-1:00pm, schedule will be published before exam).
Sign up for one of the exam days in AIS before June 11.
Remedial exams will take place in the last week of the exam period. Beware, there will not be much time to prepare a better project. Projects should be submitted as homeworks to /submit/project.
· Cloud homework is due on May 20 9:00am.


Command-line basics

Latest revision as of 20:38, 21 February 2024

This is a brief tutorial for students who are not familiar with the Linux command line.

Files and folders

  • Images, texts, data, etc. are stored in files.
  • Files are grouped in folders (directories) for better organization.
  • A folder can also contain other folders, forming a tree structure.

Moving around folders (ls, cd)

  • One folder is always selected as the current one; it is shown on the command line
  • The list of files and folders in the current folder can be obtained with the ls command
  • The list of files in some other folder can be obtained with the command ls other_folder
  • The command cd new_folder changes the current folder to the specified new folder
  • Notes: ls is an abbreviation of "list", cd is an abbreviation of "change directory"

Example:

  • When we log in to the server, we are in the folder /home/username.
  • We then execute the commands listed below.
  • Using the cd command, we move to folder /tasks/perl/ (the computer does not print anything, it only changes the current folder).
  • Using the ls command, we print all files in the /tasks/perl/ folder.
  • Finally, we use the ls /tasks command to print the folders in /tasks.
username@vyuka:~$ cd /tasks/perl/
username@vyuka:/tasks/perl$ ls
fastq-lengths.pl  reads-small.fastq  reads-tiny-trim1.fastq  series.tsv
protocol.txt      reads-tiny.fasta   reads-tiny-trim2.fastq
reads.fastq       reads-tiny.fastq   series-small.tsv
username@vyuka:/tasks/perl$ ls /tasks
bash  bioinf1  bioinf2  bioinf3  cloud  flask  make  perl  python  r1  r2

Absolute and relative paths

  • Absolute path determines how to get to a given file or folder from the root of the whole filesystem.
  • For example /tasks/perl/, /tasks/perl/series.tsv, /home/username etc.
  • Individual folders are separated by a slash / in the path.
  • Absolute paths start with a slash /.
  • Relative path determines how to get to a given file or folder from the current folder.
  • For example, if the current folder is /tasks/perl/, the relative path to file /tasks/perl/series.tsv is simply series.tsv
  • If the current folder is /tasks/, the relative path to file /tasks/perl/series.tsv is perl/series.tsv
  • Relative paths do not start with a slash.
  • A relative path can also go "upwards" to the containing folder using ..
  • For example, if the current folder is /tasks/perl/, the relative path .. will give us the same as /tasks and ../../home will give us /home

Commands ls, cd and others accept both relative and absolute paths.
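
The path rules above can be tried out safely in a scratch folder. The following is only a minimal sketch and not part of the course materials: the folder names tasks and perl merely mirror the example above, and the scratch folder is created with mktemp -d so that nothing on the server is touched. The command pwd -P prints the current folder.

```shell
# Create a throw-away folder tree that mimics /tasks/perl (names are illustrative only).
sandbox=$(mktemp -d)
mkdir "$sandbox/tasks" "$sandbox/tasks/perl"

# Move there using an absolute path and remember where we ended up.
cd "$sandbox/tasks/perl"
via_absolute=$(pwd -P)

# Go one level up with "..", then come back using a relative path.
cd ..
parent=$(pwd -P)
cd perl
via_relative=$(pwd -P)

echo "absolute: $via_absolute"
echo "relative: $via_relative"   # same folder as the line above
echo "parent:   $parent"

# Clean up the scratch folder.
cd /
rm -r "$sandbox"
```

Both ways of writing the path lead to the same folder; the relative form is shorter, but its meaning depends on the current folder.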

Important folders

  • Root is the folder with absolute path /, the starting point of the tree structure of folders.
  • Home directory with absolute path /home/username is set as the current folder after login.
    • Users typically store most of their files within their home directory and its subfolders, if there is no good reason to place them elsewhere.
    • Tilde ~ is an abbreviation for your home directory. For example cd ~ will place you there.

Wildcards

  • We can use wildcards to work with only selected files in a folder.
  • For example, all files starting with letter x in the current folder can be printed as ls x*
  • The star represents any number of characters in the filename.
  • All files containing letter x anywhere in the name can be printed as ls *x*
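
To see which names a wildcard selects, it may help to experiment in an empty scratch folder. This is a small sketch under assumptions: the three file names are made up for illustration, and the folder comes from mktemp -d.

```shell
# Work in a scratch folder so that only our example files can match.
sandbox=$(mktemp -d)
cd "$sandbox"
touch xray.dat box.dat notes.dat   # hypothetical file names

# Names starting with x:
starts_with_x=$(ls x*)
# Names containing x anywhere:
contains_x=$(ls *x*)

echo "x* matches:"
echo "$starts_with_x"    # xray.dat
echo "*x* matches:"
echo "$contains_x"       # box.dat and xray.dat, one per line

# Clean up the scratch folder.
cd /
rm -r "$sandbox"
```

The file notes.dat is matched by neither pattern because its name does not contain the letter x.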

Examining file content (less)

  • Type less filename
  • This will print the first page of the file on your screen. Then you can move around the file using space or keys Page up and Page down. You can quit the viewer pressing letter q (abbreviation of quit). Help with additional keys is accessed by pressing h (abbreviation of help).
  • If you have a short file and want to just print it all on your screen, use cat filename
  • Try for example the following commands:
less /tasks/perl/reads-small.fastq  # move around the file, then press q
cat /tasks/perl/reads-tiny.fasta    # see the whole file on the screen

Organizing files and folders

Creating new folders (mkdir)

  • To create a new folder (directory), use a command of the form mkdir new_folder_path
  • The path to the new folder can be relative or absolute
  • For example, assume that we are in the home folder. The following two commands create a new folder called test and then a folder test2 within it.
mkdir test
# change the next command according to your username
mkdir /home/username/test/test2
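
The order of the two commands above matters, because mkdir normally expects the containing folder to exist already. The sketch below illustrates this in a scratch folder; it also uses the -p option of mkdir, which is not covered in the text above and which creates all missing folders along the path. The folder names test3 and test4 are hypothetical.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# This fails because the containing folder "test" does not exist yet.
if mkdir test/test2 2>/dev/null; then
    nested_without_parent="created"
else
    nested_without_parent="failed"
fi
echo "mkdir test/test2 before mkdir test: $nested_without_parent"

# Creating the outer folder first and then the inner one works.
mkdir test
mkdir test/test2

# Alternatively, -p creates all missing folders on the path in one step.
mkdir -p test3/test4

# Record whether both nested folders now exist.
tree_ok="no"
if [ -d test/test2 ] && [ -d test3/test4 ]; then
    tree_ok="yes"
fi
echo "both nested folders exist: $tree_ok"

cd /
rm -r "$sandbox"
```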

Copying files (cp)

  • To copy files, use a command of the form cp source destination
  • This will copy file specified as the source to the destination.
  • Both source and destination can be specified via absolute or relative paths.
  • The destination can be a folder into which the file is copied or an entire name of the copied file.
  • We can also copy several files to the same folder as follows: cp file1 file2 file3 destination

Example: Let us assume that the current directory is /home/username and that directories test and test2 were created as above. The following commands first copy file /tasks/perl/reads-small.fastq to the current folder (the home directory) and then from there to the new directory test. In the third step, the file is copied again within the current folder under a new name test.fastq. After changing to the test directory, the renamed file is copied there as well, and finally both files are copied to the folder test2.

# change the next command according to your username
# this copies an existing file to the home directory using absolute paths
cp /tasks/perl/reads-small.fastq /home/username/
# now we use relative paths to copy the file from home to the new folder called test
cp reads-small.fastq test/
# now the file is copied within current folder under a new filename test.fastq
cp reads-small.fastq test.fastq
# change directory to test
cd test
# copy the file again from the home directory to the test directory under name test.fastq
cp ../test.fastq .
# we can copy several files to a different folder
cp test.fastq reads-small.fastq test2/
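
The same sequence can be rehearsed away from the server. The sketch below is an assumption-laden variant of the example: it works in a scratch folder from mktemp -d and uses a tiny made-up FASTQ file in place of /tasks/perl/reads-small.fastq. At the end it uses cmp -s (a command not introduced above) to check that a copy has the same content as the original.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir test
mkdir test/test2
# A tiny stand-in for the course file (content is made up).
printf '@read1\nACGT\n+\nIIII\n' > reads-small.fastq

cp reads-small.fastq test/              # destination is a folder: the name is kept
cp reads-small.fastq test.fastq         # destination is a file name: the copy is renamed
cd test
cp ../test.fastq .                      # "." stands for the current folder
cp test.fastq reads-small.fastq test2/  # several sources, the last argument is the folder

# cmp -s prints nothing and succeeds only if both files have identical content.
if cmp -s reads-small.fastq test2/test.fastq; then
    copies_identical="yes"
else
    copies_identical="no"
fi
copied_names=$(ls test2)
echo "files in test2:"
echo "$copied_names"
echo "copy identical to original: $copies_identical"

cd /
rm -r "$sandbox"
```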

Other file-related commands (cp -r, mv, rm, rmdir)

  • Copying whole folders can be done via cp -r source destination
  • When using cp, it is a good idea to add the -i option, which asks for confirmation before overwriting an existing file. For example:
cd ~
cp -i reads-small.fastq test/
  • To move files to a new folder or rename them, you can use mv command, which works similarly to cp, i.e. you specify first source, then destination. Option -i can be used here as well.
  • Command rm will delete specified files, rm -r whole folders (be very careful!).
  • An empty folder can be deleted using rmdir
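
The commands from this section can be combined as in the following sketch. It runs entirely inside a scratch folder created by mktemp -d, and all file and folder names in it are hypothetical.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir old_folder
echo "some notes" > draft.txt

mv draft.txt notes.txt        # rename within the same folder
mv notes.txt old_folder/      # move into another folder
cp -r old_folder backup       # copy the whole folder including its content

# rmdir refuses to delete a folder that still contains files.
if rmdir old_folder 2>/dev/null; then
    rmdir_nonempty="removed"
else
    rmdir_nonempty="refused"
fi
echo "rmdir on a non-empty folder: $rmdir_nonempty"

rm old_folder/notes.txt       # delete the file
rmdir old_folder              # the folder is now empty, so this works
rm -r backup                  # delete a folder together with its content

# List everything that is left, including hidden files.
remaining=$(ls -A)
echo "left in the scratch folder: [$remaining]"

cd /
rm -r "$sandbox"
```

Using rmdir instead of rm -r where possible adds a small safety net, because it only ever removes empty folders.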

Beware: be very careful on the command-line

  • The command line will execute whatever you type; it generally does not ask for confirmation, even for dangerous actions.
  • You can very easily remove or overwrite some important file by mistake.
  • There is no undo.
  • Therefore always check your command before pressing Enter. Use -i option for cp, mv, and possibly even rm.
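
One common habit for checking a command that contains wildcards is to put echo in front of it first. The shell then expands the wildcard but only prints the resulting command instead of running it. This is a suggestion rather than something the rules above require, and the file names in the sketch are made up.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
touch report.txt result1.tmp result2.tmp   # hypothetical files

# Dry run: echo prints the command as it looks after wildcard expansion.
preview=$(echo rm *.tmp)
echo "$preview"               # rm result1.tmp result2.tmp

# Nothing has been deleted so far.
count_before=$(ls | wc -l)

# After checking the preview, run the real command.
rm *.tmp
left=$(ls)
echo "left after rm: $left"   # report.txt

cd /
rm -r "$sandbox"
```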

File permissions and other properties

  • Linux gives us control over which files we share and which we keep private.
  • Each file (and folder) has its owner (usually the user who created it) and a group of users it is assigned to.
  • Each file has three levels of permissions: for its owner (abbreviated u, as user), for its group (g) and for every other user on the system (o, as other).
  • At each level, we can grant the right to read the file (abbreviated r), to write or modify it (abbreviated w) and to execute the file (abbreviated x).
  • Permission to execute is important for executable programs but also for folders, where it allows entering the folder and accessing the files in it.

The long form of ls command

  • The ls command can be run with the switch -l to produce more detailed information about files.
  • Here are two lines from the output after running ls -l /tasks/
drwxr-xr-x 2 bbrejova users    4096 Feb 11 22:43 perl
drwxrwxr-x 5 bbrejova teacher  4096 Mar  7  2022 make
  • The very first character of each line is d for folders (directories) and - for regular files. Here we see two folders.
  • The next three characters give permissions for the user, the next three for the group and the last three for other users. For example, folder perl has all three permissions rwx for the user, and all permissions except writing for the group and others.
  • Columns 3 and 4 of the output list the owner and the group assigned to the file.
  • Column 5 lists the size (by default in bytes; the -h switch changes this to human-readable units such as kilobytes, megabytes or gigabytes).
  • Finally the line contains the date of the last modification of the file and its name.
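
The first column of the listing can be taken apart character by character, as described above. The sketch below does this for a scratch folder named perl (a stand-in for /tasks/perl). It assumes the usual layout of ls -l output and uses ls -ld so that the folder itself is listed rather than its content. The chmod command with = sets permissions exactly and is used here only to get a known starting point.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir perl
chmod u=rwx,go=rx perl     # same permissions as drwxr-xr-x in the listing above

# -d lists the folder itself instead of the files inside it.
line=$(ls -ld perl)
echo "$line"

# Characters 1-10 of the line hold the type and the three permission groups.
mode=$(echo "$line" | cut -c1-10)
file_type=$(echo "$mode" | cut -c1)       # d
user_perms=$(echo "$mode" | cut -c2-4)    # rwx
group_perms=$(echo "$mode" | cut -c5-7)   # r-x
other_perms=$(echo "$mode" | cut -c8-10)  # r-x

echo "type=$file_type user=$user_perms group=$group_perms others=$other_perms"

cd /
rm -r "$sandbox"
```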

Changing file permissions

  • File permissions can be changed using chmod command.
  • It is followed by change specification in the form of [who][+ or -][which rights]
  • Part [who] can be o (others), g (group), u (user), a (all = user+group+others)
  • Sign + means adding rights, sign - means removing them
  • Rights can be w (write), r (read), x (execute)
  • There are also many other possibilities, see the chmod documentation at https://www.gnu.org/software/coreutils/manual/html_node/chmod-invocation.html
# add read permission to others for file protocol.txt:
chmod o+r protocol.txt
# remove write permissions from others and group for file protocol.txt:
chmod og-w protocol.txt
# add read permissions to everybody for whole folder "data" and its files and subfolders
# (the recursive switch is a capital -R; lowercase -r would mean "remove read permission")
chmod -R a+r data
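
The effect of each chmod call can be checked with ls -l. The following sketch repeats the three commands in a scratch folder and records the permission string after each step. The file data/table.tsv is hypothetical, and the initial chmod calls with = (one of the other possibilities described in the documentation) only establish a known starting state.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir data
touch protocol.txt data/table.tsv

# Known starting state (the = sign sets permissions exactly).
chmod u=rw,g=rw,o= protocol.txt     # -rw-rw----
chmod u=rwx,go= data                # drwx------
chmod u=rw,go= data/table.tsv       # -rw-------

chmod o+r protocol.txt
after_add=$(ls -l protocol.txt | cut -c1-10)          # -rw-rw-r--

chmod og-w protocol.txt
after_remove=$(ls -l protocol.txt | cut -c1-10)       # -rw-r--r--

# Capital -R applies the change to the folder and everything inside it.
chmod -R a+r data
folder_mode=$(ls -ld data | cut -c1-10)               # drwxr--r--
file_mode=$(ls -l data/table.tsv | cut -c1-10)        # -rw-r--r--

echo "$after_add $after_remove $folder_mode $file_mode"

cd /
rm -r "$sandbox"
```

Note that a+r alone does not let other users enter the folder data; for that they would also need the x permission on the folder.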

See also