Unix and shell programming 2001, demo 4 (14.2.2001)

  1. There are names in a file, one person per line, first names first, last name last, separated with spaces. How could you search for

    1. two-part last names (Bulwer-Lytton etc)?
    2. two-part first names (Ville-Petteri)?
    3. last names with a capital letter somewhere other than the beginning (O'Neil, MacIntosh)?
    4. names (first or last) that begin with a lower-case letter (d'Arnot)?
    5. last names with a separate prefix "von", "af", "van", "van der", "van den", "von der" (Juhan af Grann, Mies van der Rohe)?
    6. people whose last name does not end in -nen, -la or -lä?
    7. people whose first names are not in file etunimet.txt?
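
One possible approach to the searches above, as a sketch rather than a model answer: each search is a single grep (or awk) call over the names file. The function names are hypothetical, every line is assumed to hold at least one first name followed by a last name, and the third suffix in item 6 is assumed to be -lä.

```shell
# Hypothetical helper names; each takes the names file as $1.

# 1.1: the last word contains a hyphen
two_part_last()  { grep -E ' [^ ]+-[^ ]+$' "$1"; }

# 1.2: a hyphenated word followed by a space (so it is not the last name)
two_part_first() { grep -E '[^ ]+-[^ ]+ ' "$1"; }

# 1.3: capital inside the last word, not first and not right after a hyphen
inner_capital()  { grep -E ' [^ ]*[^ -][[:upper:]][^ ]*$' "$1"; }

# 1.4: any word starting with a lower-case letter (also matches "van", "af")
lower_initial()  { grep -E '(^| )[[:lower:]]' "$1"; }

# 1.5: one of the listed prefixes directly before the last word
with_prefix()    { grep -E ' (von|af|van|van der|van den|von der) [^ ]+$' "$1"; }

# 1.6: assumption: the suffixes are -nen, -la and -lä
other_ending()   { grep -Ev '(nen|la|lä)$' "$1"; }

# 1.7: people with at least one first name missing from the list file $2;
#      every word except the last is treated as a first name
unknown_first() {
    awk 'NR == FNR { known[$1] = 1; next }
         { for (i = 1; i < NF; i++) if (!($i in known)) { print; next } }' "$2" "$1"
}
```

For example, `unknown_first names.txt etunimet.txt` would list the people in question, where names.txt is a placeholder for the names file.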

  2. How could you

    1. encrypt a file with Caesar's method, where each letter is replaced with the next one in the alphabet (and the last letter with a), and decrypt a file encrypted this way?
    2. on every line of a file swap adjacent words (leaving possible odd word at the end untouched), e.g.,
      "John Doe is a bit looney." ->
      "Doe John a is looney bit." ?
    3. insert a column of zeroes into a file containing integers separated with commas, between columns 3 and 4?
    4. check that a string contains exactly 12 hexadecimal digits (0-9, A-F, a-f), either with no separators, separated with colons between every two (12:34:56:78:90:AB) or split in the middle with a dash (123456-7890AB) but with no other extra characters?
    5. clean up hex numbers like those in the previous problem: remove colons and dashes and convert a-f to upper case?
    6. split a file containing Finnish social security numbers (one per line) into three files 18XX, 19XX and 20XX according to birth century? (SSN format is DDMMYYXNNNC, where X represents the century: + for 18xx, - for 19xx and A for 20xx.)
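
The six tasks above can each be sketched with one standard filter (tr, sed, awk or grep). The function names below are hypothetical, only the unaccented letters a-z are handled, and a "word" in the swap is taken to be a run of letters, which is what keeps the final full stop in place in the example.

```shell
# Hypothetical helper names; filters read stdin unless noted otherwise.

# 2.1: shift every letter one step forward, and back again
caesar_enc() { tr 'a-zA-Z' 'b-zaB-ZA'; }
caesar_dec() { tr 'b-zaB-ZA' 'a-zA-Z'; }

# 2.2: swap adjacent words, keeping the separators between them in place
swap_words() { sed -E 's/([[:alpha:]]+)([^[:alpha:]]+)([[:alpha:]]+)/\3\2\1/g'; }

# 2.3: append ",0" to the third comma-separated field
add_zero_col() { awk 'BEGIN { FS = OFS = "," } { $3 = $3 OFS "0"; print }'; }

# 2.4: succeed only if $1 is 12 hex digits in one of the three allowed forms
is_hex12() {
    printf '%s\n' "$1" | grep -Eqx \
        '[0-9A-Fa-f]{12}|([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}|[0-9A-Fa-f]{6}-[0-9A-Fa-f]{6}'
}

# 2.5: drop colons and dashes, then upper-case a-f
clean_hex() { tr -d ':-' | tr 'a-f' 'A-F'; }

# 2.6: route each line of file $1 by its 7th character;
#      writes 18XX, 19XX and 20XX into the current directory
split_ssn() {
    awk '{ c = substr($0, 7, 1) }
         c == "+" { print > "18XX" }
         c == "-" { print > "19XX" }
         c == "A" { print > "20XX" }' "$1"
}
```

For instance, `caesar_enc < plain.txt | caesar_dec` should reproduce plain.txt (a placeholder file name), and `is_hex12 12:34:56:78:90:AB && echo ok` prints ok.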

    1. How could you collect all HREF tags from an HTML file (format <a href="...">), printing only them one per line without any surrounding text?
    2. As above but print also the reference text, i.e., all until following </a>.
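
A minimal sketch for the two HREF tasks, assuming a grep that supports the -o option (GNU and BSD grep do; it is not in POSIX). The function names are hypothetical, and the sketch further assumes that each tag sits on a single line, is written exactly as `<a href="...">`, and that the reference text contains no nested tags.

```shell
# Hypothetical helper names; both read HTML from stdin.

# Print every <a href="..."> tag on its own line (-i also accepts <A HREF=...>)
hrefs() { grep -Eio '<a href="[^"]*">'; }

# As above, but include the reference text up to the closing </a>
hrefs_with_text() { grep -Eio '<a href="[^"]*">[^<]*</a>'; }
```

Usage would be along the lines of `hrefs < page.html`, with page.html standing in for the file to scan.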

  3. The right way to represent, e.g., Scandinavian characters in an HTML document is &Auml; (for Ä) etc. Let the file htmlspecials contain a list of such special characters and their HTML forms like this:

    character notation

    Write scripts that use this file to convert special characters in a file into HTML format and back.
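
One way to sketch both directions is a single awk filter that loads the table first and then does plain string replacement, which avoids having to escape `&` for sed. The function name `convert` and its direction keywords are hypothetical, and the table is assumed to hold one whitespace-separated pair per line, for example `Ä &Auml;`.

```shell
# Hypothetical helper: convert to_html|from_html TABLE < text
convert() {
    awk -v dir="$1" '
        # First file: remember each character and its HTML notation
        NR == FNR { if (NF >= 2) { ch[++n] = $1; ent[n] = $2 } next }
        {
            for (i = 1; i <= n; i++) {
                from = (dir == "to_html") ? ch[i] : ent[i]
                to   = (dir == "to_html") ? ent[i] : ch[i]
                # Literal (non-regex) replacement of every occurrence of "from"
                out = ""
                while ((p = index($0, from)) > 0) {
                    out = out substr($0, 1, p - 1) to
                    $0 = substr($0, p + length(from))
                }
                $0 = out $0
            }
            print
        }' "$2" -
}
```

A call such as `convert to_html htmlspecials < text.txt` would produce the HTML form, and `convert from_html htmlspecials` reverses it. Pairs are applied in table order, so if the table also lists `&` itself, that pair would have to come first when converting to HTML.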

  4. The header of an email message looks like this:
    From Hewlett-Packard@hpcc963.external.hp.com  Thu Apr  6 12:36:23 2000
    Return-Path: <Hewlett-Packard@hpcc963.external.hp.com>
    Received: from hpcc963.external.hp.com (hpcc963.external.hp.com [])
            by jane.math.jyu.fi (8.8.6 (PHNE_14041)/8.8.6) with ESMTP id MAA16086
            for <tt@mit.jyu.fi>; Thu, 6 Apr 2000 12:36:22 +0300 (EETDST)
    From: Hewlett-Packard@hpcc963.external.hp.com
    Received: (from root@localhost)
            by hpcc963.external.hp.com (8.8.6 (PHNE_17135)/8.8.5tis) id FAA12651;
            Thu, 6 Apr 2000 05:36:19 -0400 (EDT)
    Date: Thu, 6 Apr 2000 05:36:19 -0400 (EDT)
    Message-Id: <200004060936.FAA12651@hpcc963.external.hp.com>
    To: tt@mit.jyu.fi
    Subject: Extension Software Patch Bundles Software License
    Status: RO
    Content-Length: 1070
    Lines: 42

    The first header line always begins with "From " (capital F, trailing space); multiline header fields are indicated by indenting the continuation lines; and the header is separated from the message body by a blank line. A mail folder can contain any number of messages; a new message can be detected by its initial "From " line.

    Write a script that processes a mail folder, checks every message to see if its subject includes the word "kalakukko" and if so, writes the message (with header) to file $HOME/censored, otherwise inserts a header line
    X-note: approved by censorship board
    and prints the message (with header) to stdout.
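
The processing described above could be sketched as an awk filter that buffers one message at a time and decides where it goes when the next "From " line (or the end of input) arrives. The function name `censor` and its optional argument for overriding the output file are hypothetical conveniences; the sketch only looks at the first line of the Subject field and matches "kalakukko" case-sensitively.

```shell
# Hypothetical helper: censor [FILE] < mailfolder
# Censored messages are appended to FILE (default $HOME/censored),
# approved ones go to standard output with the extra header line.
censor() (
    out=${1:-$HOME/censored}
    awk -v out="$out" '
        function emit() {
            if (!seen) return
            if (bad) printf "%s%s", head, body >> out
            else printf "%sX-note: approved by censorship board\n%s", head, body
        }
        # A "From " line starts a new message: flush the previous one
        /^From / { emit(); seen = 1; inhead = 1; bad = 0; head = ""; body = "" }
        # The first blank line ends the header; it is kept as part of the body
        inhead && /^$/ { inhead = 0 }
        inhead {
            if ($0 ~ /^Subject:/ && $0 ~ /kalakukko/) bad = 1
            head = head $0 "\n"
            next
        }
        { body = body $0 "\n" }
        END { emit() }'
)
```

Because the X-note line is appended to the buffered header, it ends up directly before the blank line that separates header and body.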

    1. Write a script that looks for palindromes in stdin (single words that read the same reversed, e.g., "level"; multi-word palindromes do not count). As an argument the script should accept the desired palindrome length, either exact (n) or a range (n-m). You can assume a maximum length of 19 characters.

    2. Assume a file has palindromes of various lengths, one per line. How could you sort them in alphabetical order using as key their second half (starting from the middle character if the length is odd, after the midpoint if even)? (E.g., abba < radar < racecar < deified < aibohphobia < kayak.)
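
Both palindrome tasks can be sketched with tr, awk and sort. The function names are hypothetical; the first sketch splits the input into runs of letters and compares case-insensitively, and the second uses a decorate, sort, undecorate pipeline with byte-order (C locale) sorting as a stand-in for "alphabetical".

```shell
# Hypothetical helper: palindromes N | palindromes N-M   (reads stdin)
palindromes() (
    case $1 in
        *-*) min=${1%-*} max=${1#*-} ;;
        *)   min=$1 max=$1 ;;
    esac
    # One word per line, then keep words of the wanted length that equal
    # their own reversal
    tr -cs '[:alpha:]' '\n' |
    awk -v min="$min" -v max="$max" '
        length($0) >= min && length($0) <= max {
            w = tolower($0); r = ""
            for (i = length(w); i > 0; i--) r = r substr(w, i, 1)
            if (w == r) print
        }'
)

# Hypothetical helper: sort palindromes (one per line on stdin) by second half.
# int(n/2)+1 is the middle character for odd n and the first character after
# the midpoint for even n.
sort_by_second_half() {
    awk '{ n = length($0); print substr($0, int(n / 2) + 1) "\t" $0 }' |
        LC_ALL=C sort | cut -f2-
}
```

For example, `palindromes 4-5 < text.txt` would print the four- and five-letter palindromes of text.txt (a placeholder file name) in order of appearance.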
