[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

77. stringproc


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

77.1 Introduction to string processing

stringproc.lisp enlarges Maximas capabilities of working with strings and adds some useful functions for file in/output.

For questions and bugs please mail to volkervannek at gmail dot com .

In Maxima a string is easily constructed by typing "text". stringp tests for strings.

(%i1) m: "text";
(%o1)                         text
(%i2) stringp(m);
(%o2)                         true

Characters are represented as strings of length 1. These are not Lisp characters. Tests can be done with charp (respectively lcharp and conversion from Lisp to Maxima characters with cunlisp).

(%i1) c: "e";
(%o1)                           e
(%i2) [charp(c),lcharp(c)];
(%o2)                     [true, false]
(%i3) supcase(c);
(%o3)                           E
(%i4) charp(%);
(%o4)                         true

All functions in stringproc.lisp that return characters, return Maxima-characters. Due to the fact, that the introduced characters are strings of length 1, you can use a lot of string functions also for characters. As seen, supcase is one example.

It is important to know, that the first character in a Maxima-string is at position 1. This is designed due to the fact that the first element in a Maxima-list is at position 1 too. See definitions of charat and charlist for examples.

In applications string-functions are often used when working with files. You will find some useful stream- and print-functions in stringproc.lisp. The following example shows some of the here introduced functions at work.

Example:

openw returns an output stream to a file, printf then allows formatted writing to this file. See printf for details.

(%i1) s: openw("E:/file.txt");
(%o1)                    #<output stream E:/file.txt>
(%i2) for n:0 thru 10 do printf( s, "~d ", fib(n) );
(%o2)                                done
(%i3) printf( s, "~%~d ~f ~a ~a ~f ~e ~a~%", 
              42,1.234,sqrt(2),%pi,1.0e-2,1.0e-2,1.0b-2 );
(%o3)                                false
(%i4) close(s);
(%o4)                                true

After closing the stream you can open it again, this time with input direction. readline returns the entire line as one string. The stringproc package now offers a lot of functions for manipulating strings. Tokenizing can be done by split or tokens.

(%i5) s: openr("E:/file.txt");
(%o5)                     #<input stream E:/file.txt>
(%i6) readline(s);
(%o6)                     0 1 1 2 3 5 8 13 21 34 55 
(%i7) line: readline(s);
(%o7)               42 1.234 sqrt(2) %pi 0.01 1.0E-2 1.0b-2
(%i8) list: tokens(line);
(%o8)           [42, 1.234, sqrt(2), %pi, 0.01, 1.0E-2, 1.0b-2]
(%i9) map( parse_string, list );
(%o9)            [42, 1.234, sqrt(2), %pi, 0.01, 0.01, 1.0b-2]
(%i10) float(%);
(%o10) [42.0, 1.234, 1.414213562373095, 3.141592653589793, 0.01,
                                                     0.01, 0.01]
(%i11) readline(s);
(%o11)                               false
(%i12) close(s)$

readline returns false when the end of file occurs.

Categories:  Strings Share packages Package stringproc


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

77.2 Functions and Variables for input and output

Example:

(%i1) s: openw("E:/file.txt");
(%o1)                     #<output stream E:/file.txt>
(%i2) control: 
"~2tAn atom: ~20t~a~%~2tand a list: ~20t~{~r ~}~%~2t\
and an integer: ~20t~d~%"$
(%i3) printf( s,control, 'true,[1,2,3],42 )$
(%o3)                                false
(%i4) close(s);
(%o4)                                true
(%i5) s: openr("E:/file.txt");
(%o5)                     #<input stream E:/file.txt>
(%i6) while stringp( tmp:readline(s) ) do print(tmp)$
  An atom:          true 
  and a list:       one two three  
  and an integer:   42 
(%i7) close(s)$

Function: close (stream)

Closes stream and returns true if stream had been open.

Function: flength (stream)

Returns the number of elements in stream where stream has to be a stream from or to a file.

Function: fposition (stream)
Function: fposition (stream, pos)

Returns the current position in stream, if pos is not used. If pos is used, fposition sets the position in stream. stream has to be a stream from or to a file and pos has to be a positive number where the first element in stream is in position 1.

Function: freshline ()
Function: freshline (stream)

Writes a new line (to stream), if the position is not at the beginning of a line. See also newline.

Function: get_output_stream_string (stream)

Returns a string containing all the characters currently present in stream which must be an open string-output stream. The returned characters are removed from stream.

Example: See make_string_output_stream .

Categories:  Package stringproc

Function: make_string_input_stream (string)
Function: make_string_input_stream (string, start)
Function: make_string_input_stream (string, start, end)

Returns an input stream which contains parts of string and an end of file. Without optional arguments the stream contains the entire string and is positioned in front of the first character. start and end define the substring contained in the stream. The first character is available at position 1.

(%i1) istream : make_string_input_stream("text", 1, 4);
(%o1)              #<string-input stream from "text">
(%i2) (while (c : readchar(istream)) # false do sprint(c), newline())$
t e x 
(%i3) close(istream)$

Categories:  Package stringproc

Function: make_string_output_stream ()

Returns an output stream that accepts characters. Characters currently present in this stream can be retrieved by get_output_stream_string.

(%i1) ostream : make_string_output_stream();
(%o1)               #<string-output stream 09622ea0>
(%i2) printf(ostream, "foo")$

(%i3) printf(ostream, "bar")$

(%i4) string : get_output_stream_string(ostream);
(%o4)                            foobar
(%i5) printf(ostream, "baz")$

(%i6) string : get_output_stream_string(ostream);
(%o6)                              baz
(%i7) close(ostream)$

Categories:  Package stringproc

Function: newline ()
Function: newline (stream)

Writes a new line (to stream). See sprint for an example of using newline(). Note that there are some cases, where newline() does not work as expected.

Function: opena (file)

Returns an output stream to file. If an existing file is opened, opena appends elements at the end of file.

Function: openr (file)

Returns an input stream to file. If file does not exist, it will be created.

Function: openw (file)

Returns an output stream to file. If file does not exist, it will be created. If an existing file is opened, openw destructively modifies file.

Function: printf (dest, string)
Function: printf (dest, string, expr_1, ..., expr_n)

Produces formatted output by outputting the characters of control-string string and observing that a tilde introduces a directive. The character after the tilde, possibly preceded by prefix parameters and modifiers, specifies what kind of formatting is desired. Most directives use one or more elements of the arguments expr_1, ..., expr_n to create their output.

If dest is a stream or true, then printf returns false. Otherwise, printf returns a string containing the output.

printf provides the Common Lisp function format in Maxima. The following example illustrates the general relation between these two functions.

(%i1) printf(true, "R~dD~d~%", 2, 2);
R2D2
(%o1)                                false
(%i2) :lisp (format t "R~dD~d~%" 2 2)
R2D2
NIL

The following description is limited to a rough sketch of the possibilities of printf. The Lisp function format is described in detail in many reference books. Of good help is e.g. the free available online-manual "Common Lisp the Language" by Guy L. Steele. See chapter 22.3.3 there.

   ~%       new line
   ~&       fresh line
   ~t       tab
   ~$       monetary
   ~d       decimal integer
   ~b       binary integer
   ~o       octal integer
   ~x       hexadecimal integer
   ~br      base-b integer
   ~r       spell an integer
   ~p       plural
   ~f       floating point
   ~e       scientific notation
   ~g       ~f or ~e, depending upon magnitude
   ~h       bigfloat
   ~a       uses Maxima function string
   ~s       like ~a, but output enclosed in "double quotes"
   ~~       ~
   ~<       justification, ~> terminates
   ~(       case conversion, ~) terminates 
   ~[       selection, ~] terminates 
   ~{       iteration, ~} terminates

The directive ~h for bigfloat is no Lisp-standard and is therefore illustrated below.

Note that the directive ~* is not supported.

If dest is a stream or true, then printf returns false. Otherwise, printf returns a string containing the output.

(%i1) printf( false, "~a ~a ~4f ~a ~@r", 
              "String",sym,bound,sqrt(12),144), bound = 1.234;
(%o1)                 String sym 1.23 2*sqrt(3) CXLIV
(%i2) printf( false,"~{~a ~}",["one",2,"THREE"] );
(%o2)                          one 2 THREE 
(%i3) printf(true,"~{~{~9,1f ~}~%~}",mat ),
          mat = args(matrix([1.1,2,3.33],[4,5,6],[7,8.88,9]))$
      1.1       2.0       3.3 
      4.0       5.0       6.0 
      7.0       8.9       9.0 
(%i4) control: "~:(~r~) bird~p ~[is~;are~] singing."$
(%i5) printf( false,control, n,n,if n=1 then 1 else 2 ), n=2;
(%o5)                    Two birds are singing.

The directive ~h has been introduced to handle bigfloats.

~w,d,e,x,o,p@H
 w : width
 d : decimal digits behind floating point
 e : minimal exponent digits
 x : preferred exponent
 o : overflow character
 p : padding character
 @ : display sign for positive numbers
(%i1) fpprec : 1000$
(%i2) printf(true, "|~h|~%", 2.b0^-64)$
|0.0000000000000000000542101086242752217003726400434970855712890625|
(%i3) fpprec : 26$
(%i4) printf(true, "|~h|~%", sqrt(2))$
|1.4142135623730950488016887|
(%i5) fpprec : 24$
(%i6) printf(true, "|~h|~%", sqrt(2))$
|1.41421356237309504880169|
(%i7) printf(true, "|~28h|~%", sqrt(2))$
|   1.41421356237309504880169|
(%i8) printf(true, "|~28,,,,,'*h|~%", sqrt(2))$
|***1.41421356237309504880169|
(%i9) printf(true, "|~,18h|~%", sqrt(2))$
|1.414213562373095049|
(%i10) printf(true, "|~,,,-3h|~%", sqrt(2))$
|1414.21356237309504880169b-3|
(%i11) printf(true, "|~,,2,-3h|~%", sqrt(2))$
|1414.21356237309504880169b-03|
(%i12) printf(true, "|~20h|~%", sqrt(2))$
|1.41421356237309504880169|
(%i13) printf(true, "|~20,,,,'+h|~%", sqrt(2))$
|++++++++++++++++++++|

Function: readchar (stream)

Removes and returns the first character in stream. If the end of file is encountered readchar returns false.

Example: See make_string_input_stream.

Function: readline (stream)

Returns a string containing the characters from the current position in stream up to the end of the line or false if the end of the file is encountered.

Function: sprint (expr_1, …, expr_n)

Evaluates and displays its arguments one after the other `on a line' starting at the leftmost position. The numbers are printed with the '-' right next to the number, and it disregards line length. newline(), which will be autoloaded from stringproc.lisp might be useful, if you whish to place intermediate line breaking.

Example:

(%i1) for n:0 thru 19 do sprint( fib(n) )$
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
(%i2) for n:0 thru 22 do ( 
         sprint(fib(n)), if mod(n,10)=9 then newline() )$
0 1 1 2 3 5 8 13 21 34 
55 89 144 233 377 610 987 1597 2584 4181 
6765 10946 17711 

Categories:  Package stringproc


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

77.3 Functions and Variables for characters

Function: alphacharp (char)

Returns true if char is an alphabetic character.

Function: alphanumericp (char)

Returns true if char is an alphabetic character or a digit.

Function: ascii (int)

Returns the character corresponding to the ASCII number int. ( -1 < int < 256 )

(%i1) for n from 0 thru 255 do ( 
   tmp: ascii(n), if alphacharp(tmp) then sprint(tmp),
      if n=96 then newline() )$
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
a b c d e f g h i j k l m n o p q r s t u v w x y z

Categories:  Package stringproc

Function: cequal (char_1, char_2)

Returns true if char_1 and char_2 are the same.

Function: cequalignore (char_1, char_2)

Like cequal but ignores case.

Function: cgreaterp (char_1, char_2)

Returns true if the ASCII number of char_1 is greater than the number of char_2.

Function: cgreaterpignore (char_1, char_2)

Like cgreaterp but ignores case.

Function: charp (obj)

Returns true if obj is a Maxima-character. See introduction for example.

Function: cint (char)

Returns the ASCII number of char.

Categories:  Package stringproc

Function: clessp (char_1, char_2)

Returns true if the ASCII number of char_1 is less than the number of char_2.

Function: clesspignore (char_1, char_2)

Like clessp but ignores case.

Function: constituent (char)

Returns true if char is a graphic character and not the space character. A graphic character is a character one can see, plus the space character. (constituent is defined by Paul Graham, ANSI Common Lisp, 1996, page 67.)

(%i1) for n from 0 thru 255 do ( 
tmp: ascii(n), if constituent(tmp) then sprint(tmp) )$
! " #  %  ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B
C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~

Function: cunlisp (lisp_char)

Converts a Lisp-character into a Maxima-character. (You won't need it.)

Categories:  Package stringproc

Function: digitcharp (char)

Returns true if char is a digit.

Function: lcharp (obj)

Returns true if obj is a Lisp-character. (You won't need it.)

Function: lowercasep (char)

Returns true if char is a lowercase character.

Variable: newline

The newline character.

Variable: space

The space character.

Variable: tab

The tab character.

Function: uppercasep (char)

Returns true if char is an uppercase character.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

77.4 Functions and Variables for strings

Function: base64 (string)

Returns the base64-representation of string as a string.

Example:

(%i1) base64 : base64("foo bar baz");
(%o1)                       Zm9vIGJhciBiYXo=
(%i2) string : base64_decode(base64);
(%o2)                          foo bar baz

Categories:  Package stringproc

Function: base64_decode (base64-string)

Decodes the string base64-string coded in base64 back to the original string.

Example: See base64.

Categories:  Package stringproc

Function: charat (string, n)

Returns the n-th character of string. The first character in string is returned with n = 1.

(%i1) charat("Lisp",1);
(%o1)                           L

Categories:  Package stringproc

Function: charlist (string)

Returns the list of all characters in string.

(%i1) charlist("Lisp");
(%o1)                     [L, i, s, p]
(%i2) %[1];
(%o2)                           L

Categories:  Package stringproc

Function: eval_string (str)

Parse the string str as a Maxima expression and evaluate it. The string str may or may not have a terminator (dollar sign $ or semicolon ;). Only the first expression is parsed and evaluated, if there is more than one.

Complain if str is not a string.

Examples:

(%i1) eval_string ("foo: 42; bar: foo^2 + baz");
(%o1)                       42
(%i2) eval_string ("(foo: 42, bar: foo^2 + baz)");
(%o2)                   baz + 1764

See also parse_string.

Categories:  Package stringproc

Function: md5sum (string)

Returns the md5 checksum of a string as a string. To parse the returned value into an integer please set the input base to 16 and prefix the string by zero.

Example:

(%i1) string : md5sum("foo bar baz");
(%o1)                  ab07acbb1e496801937adfa772424bf7
(%i2) ibase : obase : 16.$

(%i3) integer : parse_string(sconcat(0, string));
(%o3)                 0ab07acbb1e496801937adfa772424bf7

Categories:  Package stringproc

Function: parse_string (str)

Parse the string str as a Maxima expression (do not evaluate it). The string str may or may not have a terminator (dollar sign $ or semicolon ;). Only the first expression is parsed, if there is more than one.

Complain if str is not a string.

Examples:

(%i1) parse_string ("foo: 42; bar: foo^2 + baz");
(%o1)                    foo : 42
(%i2) parse_string ("(foo: 42, bar: foo^2 + baz)");
                                   2
(%o2)          (foo : 42, bar : foo  + baz)

See also eval_string.

Categories:  Package stringproc

Function: scopy (string)

Returns a copy of string as a new string.

Categories:  Package stringproc

Function: sdowncase (string)
Function: sdowncase (string, start)
Function: sdowncase (string, start, end)

Like supcase, but uppercase characters are converted to lowercase.

Categories:  Package stringproc

Function: sequal (string_1, string_2)

Returns true if string_1 and string_2 are the same length and contain the same characters.

Function: sequalignore (string_1, string_2)

Like sequal but ignores case.

Function: sexplode (string)

sexplode is an alias for function charlist.

Categories:  Package stringproc

Function: simplode (list)
Function: simplode (list, delim)

simplode takes a list of expressions and concatenates them into a string. If no delimiter delim is specified, simplode uses no delimiter. delim can be any string.

(%i1) simplode(["xx[",3,"]:",expand((x+y)^3)]);
(%o1)             xx[3]:y^3+3*x*y^2+3*x^2*y+x^3
(%i2) simplode( sexplode("stars")," * " );
(%o2)                   s * t * a * r * s
(%i3) simplode( ["One","more","coffee."]," " );
(%o3)                   One more coffee.

Categories:  Package stringproc

Function: sinsert (seq, string, pos)

Returns a string that is a concatenation of substring (string, 1, pos - 1), the string seq and substring (string, pos). Note that the first character in string is in position 1.

(%i1) s: "A submarine."$
(%i2) concat( substring(s,1,3),"yellow ",substring(s,3) );
(%o2)                  A yellow submarine.
(%i3) sinsert("hollow ",s,3);
(%o3)                  A hollow submarine.

Categories:  Package stringproc

Function: sinvertcase (string)
Function: sinvertcase (string, start)
Function: sinvertcase (string, start, end)

Returns string except that each character from position start to end is inverted. If end is not given, all characters from start to the end of string are replaced.

(%i1) sinvertcase("sInvertCase");
(%o1)                      SiNVERTcASE

Categories:  Package stringproc

Function: slength (string)

Returns the number of characters in string.

Categories:  Package stringproc

Function: smake (num, char)

Returns a new string with a number of num characters char.

(%i1) smake(3,"w");
(%o1)                          www

Categories:  Package stringproc

Function: smismatch (string_1, string_2)
Function: smismatch (string_1, string_2, test)

Returns the position of the first character of string_1 at which string_1 and string_2 differ or false. Default test function for matching is sequal. If smismatch should ignore case, use sequalignore as test.

(%i1) smismatch("seven","seventh");
(%o1)                           6

Categories:  Package stringproc

Function: split (string)
Function: split (string, delim)
Function: split (string, delim, multiple)

Returns the list of all tokens in string. Each token is an unparsed string. split uses delim as delimiter. If delim is not given, the space character is the default delimiter. multiple is a boolean variable with true by default. Multiple delimiters are read as one. This is useful if tabs are saved as multiple space characters. If multiple is set to false, each delimiter is noted.

(%i1) split("1.2   2.3   3.4   4.5");
(%o1)                 [1.2, 2.3, 3.4, 4.5]
(%i2) split("first;;third;fourth",";",false);
(%o2)               [first, , third, fourth]

Categories:  Package stringproc

Function: sposition (char, string)

Returns the position of the first character in string which matches char. The first character in string is in position 1. For matching characters ignoring case see ssearch.

Categories:  Package stringproc

Function: sremove (seq, string)
Function: sremove (seq, string, test)
Function: sremove (seq, string, test, start)
Function: sremove (seq, string, test, start, end)

Returns a string like string but without all substrings matching seq. Default test function for matching is sequal. If sremove should ignore case while searching for seq, use sequalignore as test. Use start and end to limit searching. Note that the first character in string is in position 1.

(%i1) sremove("n't","I don't like coffee.");
(%o1)                   I do like coffee.
(%i2) sremove ("DO ",%,'sequalignore);
(%o2)                    I like coffee.

Categories:  Package stringproc

Function: sremovefirst (seq, string)
Function: sremovefirst (seq, string, test)
Function: sremovefirst (seq, string, test, start)
Function: sremovefirst (seq, string, test, start, end)

Like sremove except that only the first substring that matches seq is removed.

Categories:  Package stringproc

Function: sreverse (string)

Returns a string with all the characters of string in reverse order.

Categories:  Package stringproc

Function: ssearch (seq, string)
Function: ssearch (seq, string, test)
Function: ssearch (seq, string, test, start)
Function: ssearch (seq, string, test, start, end)

Returns the position of the first substring of string that matches the string seq. Default test function for matching is sequal. If ssearch should ignore case, use sequalignore as test. Use start and end to limit searching. Note that the first character in string is in position 1.

(%i1) ssearch("~s","~{~S ~}~%",'sequalignore);
(%o1)                                  4

Categories:  Package stringproc

Function: ssort (string)
Function: ssort (string, test)

Returns a string that contains all characters from string in an order such there are no two successive characters c and d such that test (c, d) is false and test (d, c) is true. Default test function for sorting is clessp. The set of test functions is {clessp, clesspignore, cgreaterp, cgreaterpignore, cequal, cequalignore}.

(%i1) ssort("I don't like Mondays.");
(%o1)                    '.IMaddeiklnnoosty
(%i2) ssort("I don't like Mondays.",'cgreaterpignore);
(%o2)                 ytsoonnMlkIiedda.'   

Categories:  Package stringproc

Function: ssubst (new, old, string)
Function: ssubst (new, old, string, test)
Function: ssubst (new, old, string, test, start)
Function: ssubst (new, old, string, test, start, end)

Returns a string like string except that all substrings matching old are replaced by new. old and new need not to be of the same length. Default test function for matching is sequal. If ssubst should ignore case while searching for old, use sequalignore as test. Use start and end to limit searching. Note that the first character in string is in position 1.

(%i1) ssubst("like","hate","I hate Thai food. I hate green tea.");
(%o1)          I like Thai food. I like green tea.
(%i2) ssubst("Indian","thai",%,'sequalignore,8,12);
(%o2)         I like Indian food. I like green tea.

Categories:  Package stringproc

Function: ssubstfirst (new, old, string)
Function: ssubstfirst (new, old, string, test)
Function: ssubstfirst (new, old, string, test, start)
Function: ssubstfirst (new, old, string, test, start, end)

Like subst except that only the first substring that matches old is replaced.

Categories:  Package stringproc

Function: strim (seq,string)

Returns a string like string, but with all characters that appear in seq removed from both ends.

(%i1) "/* comment */"$
(%i2) strim(" /*",%);
(%o2)                        comment
(%i3) slength(%);
(%o3)                           7

Categories:  Package stringproc

Function: striml (seq, string)

Like strim except that only the left end of string is trimmed.

Categories:  Package stringproc

Function: strimr (seq, string)

Like strim except that only the right end of string is trimmed.

Categories:  Package stringproc

Function: stringp (obj)

Returns true if obj is a string. See introduction for example.

Function: substring (string, start)
Function: substring (string, start, end)

Returns the substring of string beginning at position start and ending at position end. The character at position end is not included. If end is not given, the substring contains the rest of the string. Note that the first character in string is in position 1.

(%i1) substring("substring",4);
(%o1)                        string
(%i2) substring(%,4,6);
(%o2)                          in

Categories:  Package stringproc

Function: supcase (string)
Function: supcase (string, start)
Function: supcase (string, start, end)

Returns string except that lowercase characters from position start to end are replaced by the corresponding uppercase ones. If end is not given, all lowercase characters from start to the end of string are replaced.

(%i1) supcase("english",1,2);
(%o1)                        English

Categories:  Package stringproc

Function: tokens (string)
Function: tokens (string, test)

Returns a list of tokens, which have been extracted from string. The tokens are substrings whose characters satisfy a certain test function. If test is not given, constituent is used as the default test. {constituent, alphacharp, digitcharp, lowercasep, uppercasep, charp, characterp, alphanumericp} is the set of test functions. (The Lisp-version of tokens is written by Paul Graham. ANSI Common Lisp, 1996, page 67.)

(%i1) tokens("24 October 2005");
(%o1)                  [24, October, 2005]
(%i2) tokens("05-10-24",'digitcharp);
(%o2)                     [05, 10, 24]
(%i3) map(parse_string,%);
(%o3)                      [5, 10, 24]

Categories:  Package stringproc


[ << ] [ >> ]           [Top] [Contents] [Index] [ ? ]

This document was generated by Charlie & on July, 6 2015 using texi2html 1.76.