Discussion:
Newbie: Request for a style advice
Jochen Heller
2019-12-15 22:00:59 UTC
Hello dear Lispers,

I have been delving into Common Lisp for quite some time, but for the last couple of months (I do not have much time for it during the day) I have been working more efficiently and more continuously with the wonderful textbooks by Barski (Land of Lisp) and Seibel (Practical Common Lisp).

I finished the first one and will now go on to finish the second. Yet as a little intermezzo I needed to play around by myself a bit.

I wrote two functions, of which the one that does most of the work became rather large.

Since the examples in most of the textbooks I have had in my hands so far are mostly short, I have qualms about bloating my procedures; I do appreciate that small procedures are more accessible. (A while ago I stumbled upon Brian Harvey's very nice Logo textbooks and their follow-up about Scheme, with which my journey to Lisp began and which finally let me meander to Common Lisp, at first with Touretzky's Gentle Introduction, which seemed a lot like Harvey's textbooks.)

However, in the following example it seemed easier to me to avoid repetition by bloating my workhorse.

When you read it, my cumbersome approach may hit you right in the face. I hope you can still give me feedback on how to use a better style, or on whether this is a case where it is OK to formulate a solution like this. (In case this is an important style issue as well: I do like the DO macro, which seems somehow more consistent to me for such simple loops than LOOP. I want to experiment with LOOP in cases to come, especially when I start on the practical examples of Practical Common Lisp shortly.)

After this long opening speech (I'm a chatterbox, I'm afraid), my code:

;;;; Maybe the beginning of a library that helps to split arguments apart for evaluating the underlying
;;;; scheme ... someday. At least an apprentice piece for playing around with primitives.

(defun first-sent (text)
  "Takes a TEXT of type string as input and returns either its first sentence (if it is of one piece) or a list (if it is divided in several comma separated parts)."
  (let* ((pos-full-stop (position #\. text))
         (first-sentence (if (not pos-full-stop)
                             ""
                             (subseq text 0 pos-full-stop)) ) ; extracting the first sentence
         (amount-parts (1+ (count #\, first-sentence))) ; commas in sent. + 1 = no. of parts of sent.
         (ret-lst nil) ) ; list to return a sentence with clauses
    ;; auxiliary procedures
    (labels ((partsp (sent) ; tests for a sentence with more than one parts
               (find #\, sent) )
             (remove-first-space (sent) ; handles spaces at the beginning of extracted (partial)
               (do ((i 0 (1+ i))) ; sentences
                   ((not (string= sent " " :start1 i :end1 (1+ i))) (subseq sent i) )))
             (first-part-sent (sent) ; extracts the first partial sentence of the first sentence
               (let ((pos-comma (position #\, sent)))
                 (remove-first-space (subseq sent 0 pos-comma)) )))
      ;; body
      (if (not (partsp first-sentence)) ; The first sentence is of one piece
          (remove-first-space first-sentence) ; return the first sentence as a string
          ;;
          (do ((i amount-parts (1- i)))
              ((= i 0) (nreverse ret-lst)) ; else return it splitted into a list of strings
            (if (not (partsp first-sentence)) ; as soon as the first sentence is eaten up
                (push (remove-first-space first-sentence) ret-lst) ; push the last part into the list
                (progn ; until then
                  (push (first-part-sent first-sentence) ret-lst) ; push any part into the list
                  (setf first-sentence ; and eat up the first sentence part by part
                        (subseq first-sentence (1+ (position #\, first-sentence))) ))))))))

(defun split-text (text)
  "Takes a TEXT of type string as input and returns an array with the single sentences as elements."
  (let* ((full-stops (count #\. text))
         (container (make-array full-stops :fill-pointer 0))
         (tmp text) )
    (do ((i full-stops (1- i)))
        ((= i 0) container)
      (vector-push (first-sent tmp) container)
      (setf tmp (subseq tmp (1+ (position #\. tmp)))) )))
Jochen Heller
2019-12-15 22:10:42 UTC
I'm sorry, I cannot find a way to modify my original post right now. I pushed "Post" a bit too early.

So: Thank you very much for your advice. There was something else I was going to add, yet I forgot it while I was searching for a button to change my post ...

Best regards

Jochen.
Madhu
2019-12-16 02:00:41 UTC
Post by Jochen Heller
When you read it, my cumbersome approach may hit you right in the
face. I hope you can still give me feedback on how to use a better
style, or on whether this is a case where it is OK to formulate a
solution like this.
Looks like perfectly normal "idiomatic" Common Lisp to me.

Once the function works and is tested it is complete and finished
forever and you can get on with your life.

I find that the aesthetics by which to judge Common Lisp are different
from those of the other Lisps in the Lisp family.
Post by Jochen Heller
(In case this is an important style issue as well: I do like the DO
macro, which seems somehow more consistent to me for such simple
loops than LOOP. I want to experiment with LOOP in cases to come,
especially when I start on the practical examples of Practical
Common Lisp shortly.)
I'm sure you will learn to appreciate LOOP and use its full power when
appropriate.
Robert L.
2019-12-16 09:24:56 UTC
Post by Madhu
I'm sure you will learn to appreciate LOOP
Paul Graham:

I consider Loop one of the worst flaws in CL, and an example
to be borne in mind by both macro writers and language designers.

[In "ANSI Common Lisp", Graham makes the following comments:]

The loop macro was originally designed to help inexperienced
Lisp users write iterative code. Instead of writing Lisp code,
you express your program in a form meant to resemble English,
and this is then translated into Lisp. Unfortunately, loop is
more like English than its designers ever intended: you can
use it in simple cases without quite understanding how it
works, but to understand it in the abstract is almost
impossible.
....
the ANSI standard does not really give a formal specification
of its behavior.
....
The first thing one notices about the loop macro is that it
has syntax. A loop expression contains not subexpressions but
clauses. The clauses are not delimited by parentheses;
instead, each kind has a distinct syntax. In that, loop
resembles traditional Algol-like languages. But the other
distinctive feature of loop, which makes it as unlike Algol as
Lisp, is that the order in which things happen is only
loosely related to the order in which the clauses occur.
....
For such reasons, the use of loop cannot be recommended.
Jochen Heller
2019-12-17 18:05:08 UTC
Post by Robert L.
For such reasons, the use of loop cannot be recommended.
Well, at this stage, now that I have become more used to DO, it seems nice to me. At first it appeared quite confusing, yet by now I really appreciate its consistency.

Anyway, I'm open, and I am sure that Seibel will show convincing examples.

If, with more experience in my own projects, I tend to prefer one or the other, I will discover my own tendency.

I do not like discussions about matters of taste. It seems that LOOP is used quite frequently, and it also seems to be very helpful with hash tables. If I discover good patterns for DO, maybe I will prefer them. Yet I cannot make a decision for now.
Udyant Wig
2019-12-16 05:23:52 UTC
Post by Jochen Heller
(defun split-text (text)
  "Takes a TEXT of type string as input and returns an array with the single sentences as elements."
  (let* ((full-stops (count #\. text))
         (container (make-array full-stops :fill-pointer 0))
         (tmp text) )
    (do ((i full-stops (1- i)))
        ((= i 0) container)
      (vector-push (first-sent tmp) container)
      (setf tmp (subseq tmp (1+ (position #\. tmp)))) )))
Is there a particular reason you are using an array in SPLIT-TEXT?

Udyant Wig
Jochen Heller
2019-12-17 16:56:37 UTC
Post by Udyant Wig
Is there a particular reason you are using an array in SPLIT-TEXT?
Yes.

First of all, using lists is so tempting that I want to practise using arrays for a while, too.

Second, I do not know whether I will use this potential future library for huge texts. If lists become so inefficient with more than 12 cons cells or so, I would rather be used to arrays from the beginning.

At this early stage I assume that the sentences will not contain much more than maybe 7 clauses (I'm thinking of German sentences).

Thus, for routines that evaluate sentences with several clauses, list functions might still be efficient after these lists have been read from the fast array.

In the end, it's all playing around for practice, for now. :-)
Udyant Wig
2019-12-18 05:37:41 UTC
Post by Jochen Heller
If lists become so inefficient with more than 12 cons cells or so,
This seems very specific. Have you measured this?
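
One way to answer that question is to time both representations directly. The following is only a sketch: SUM-SEQUENCE and TIME-SMALL-SEQUENCES are hypothetical names that do not come from the thread, the size of 12 simply echoes the figure mentioned above, and whatever TIME prints depends on the implementation, its optimization settings, and the machine.

```lisp
;; Hypothetical helper: works on lists and vectors alike, because
;; REDUCE accepts any sequence.
(defun sum-sequence (sequence)
  "Add up the numbers in SEQUENCE, which may be a list or a vector."
  (reduce #'+ sequence))

;; Hypothetical driver: SIZE and REPETITIONS are arbitrary defaults.
(defun time-small-sequences (&optional (size 12) (repetitions 1000000))
  "Time REPETITIONS traversals of a list and of a vector of SIZE ones."
  (let ((as-list (make-list size :initial-element 1))
        (as-vector (make-array size :initial-element 1)))
    ;; TIME is standard; its report goes to *TRACE-OUTPUT*.
    (time (dotimes (i repetitions) (sum-sequence as-list)))
    (time (dotimes (i repetitions) (sum-sequence as-vector)))))
```

Calling (time-small-sequences) prints two timing reports, one for the list and one for the vector; only the comparison on your own system is meaningful.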

Udyant Wig
--
We make our discoveries through our mistakes: we watch one another's
success: and where there is freedom to experiment there is hope to
improve.
-- Arthur Quiller-Couch
Jochen Heller
2019-12-18 17:51:42 UTC
Post by Udyant Wig
Post by Jochen Heller
If lists become so inefficient with more than 12 cons cells or so,
This seems very specific. Have you measured this?
Me, no. And please: I am no professional programmer or computer scientist. I learn Lisp through books and experiments and want to use it for legal scholarly applications (didactic and analytical). So please do not expect any sound computer-scientific proof or argument. I am only looking for advice from experienced programmers.

Yet ... I think Barski in Land of Lisp wrote of about 12 cons cells or so. I seem to remember not much more than 10 cons cells (or was it the Google Style Guide?). And Seibel of course highly recommends all other data types over lists (except for some XML-like structures or macros, which are both of course still interesting for me).
t***@google.com
2019-12-18 18:29:13 UTC
Post by Jochen Heller
Post by Udyant Wig
Post by Jochen Heller
If lists become so inefficient with more than 12 cons cells or so,
This seems very specific. Have you measured this?
Me, no. And please: I am no professional programmer or computer scientist. I learn Lisp through books and experiments and want to use it for legal scholarly applications (didactic and analytical). So please do not expect any sound computer-scientific proof or argument. I am only looking for advice from experienced programmers.
Yet ... I think Barski in Land of Lisp wrote of about 12 cons cells or so. I seem to remember not much more than 10 cons cells (or was it the Google Style Guide?). And Seibel of course highly recommends all other data types over lists (except for some XML-like structures or macros, which are both of course still interesting for me).
I'm also curious about what sort of efficiency this refers to.
Because of their fundamental use in Lisp, lists have been highly optimized.

Now there are various forms of efficiency that one could consider when
comparing lists and vectors.
1. Access to a specific indexed-value. Vectors will win.
2. Size of data structure. Vectors will win over all but very short lists.
Vectors have a higher fixed overhead, whereas lists have no fixed overhead.
But each list element takes up more space.
3. Adding values to the front. Lists will win big.
4. Adding values to the end. I would think this is a tie as long as you keep a
separate pointer to the last cons cell and can destructively modify the
list. The value for extensible vectors should be amortized constant time,
but for one particular extension, it could be longer as the vector may
need to copy values to a new contiguous block of memory.
5. Iteration over values. I would expect vectors to win, mainly through cache
effects and locality of reference. List traversal is one of the highly
optimized aspects of list implementations, so it should be quick. But not
necessarily cache-friendly. Copying GC will help with this a bit, though.
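
Point 4 above can be sketched as follows. This is a minimal illustration, not code from the thread: TAIL-QUEUE and TAIL-QUEUE-ADD are hypothetical names, and the vector half uses the standard VECTOR-PUSH-EXTEND on an adjustable vector with a fill pointer.

```lisp
;; Hypothetical structure: a list plus a pointer to its last cons cell,
;; so that appending never has to walk the list.
(defstruct tail-queue
  (head nil)   ; the list itself
  (tail nil))  ; its last cons cell, or NIL while the list is empty

(defun tail-queue-add (queue item)
  "Destructively append ITEM to QUEUE in constant time; return QUEUE."
  (let ((cell (list item)))
    (if (tail-queue-tail queue)
        (setf (cdr (tail-queue-tail queue)) cell) ; link after the old tail
        (setf (tail-queue-head queue) cell))      ; first element
    (setf (tail-queue-tail queue) cell)
    queue))

;; The vector counterpart: amortized constant time, with an occasional
;; copy when the underlying storage has to grow.
(defun vector-add (vector item)
  "Append ITEM to the adjustable VECTOR; return VECTOR."
  (vector-push-extend item vector)
  vector)
```

For example, adding 1, 2 and 3 to a fresh (make-tail-queue) leaves (1 2 3) in its HEAD slot, and doing the same with (make-array 0 :adjustable t :fill-pointer 0) yields a vector holding the same three elements in the same order.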

There are probably some other cases I haven't thought of.

From some old work on the Loom project, we determined empirically that small
association lists (alists) would be faster for lookup than hash tables. But
past a certain size (which I don't recall) hash tables would be faster. Running
down the list beat the cost of computing the hash function. And alists generally
had a smaller memory footprint.
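
For readers who have not used both, the two kinds of lookup being compared look roughly like this. The data and the names *FRUIT-ALIST*, *FRUIT-TABLE*, ALIST-LOOKUP and TABLE-LOOKUP are made up for illustration; nothing here reproduces the Loom measurements.

```lisp
;; Hypothetical data: the same three entries stored both ways.
(defparameter *fruit-alist* '((apple . 1) (pear . 2) (plum . 3)))

(defparameter *fruit-table*
  (let ((table (make-hash-table :test #'eql)))
    (dolist (pair *fruit-alist* table)
      (setf (gethash (car pair) table) (cdr pair)))))

(defun alist-lookup (key)
  "Walk the alist until KEY is found; return its value or NIL."
  (cdr (assoc key *fruit-alist*)))

(defun table-lookup (key)
  "Hash KEY and return the stored value, or NIL if there is none."
  (values (gethash key *fruit-table*)))
```

Both functions return 2 for the key PEAR and NIL for a key that is absent; where the crossover in speed lies would have to be measured on the implementation at hand.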
t***@google.com
2019-12-18 22:28:31 UTC
This - and the former posts are a wonderful response. I hope I will be just as
lucky with any further questions on this list or group :-)
There are several very knowledgeable and helpful Lisp programmers who post here.
Some of them have over 40 years of experience with the language.

There are also a couple of regulars whose posts are pretty useless, but you
should be able to figure out who they are pretty quickly.
Jochen Heller
2019-12-18 22:45:16 UTC
Post by t***@google.com
There are several very knowledgeable and helpful Lisp programmers who post here.
Some of them have over 40 years of experience with the language.
I'm looking forward to it. At my tempo, I'll presumably need my next 40 years of experience until I have satisfied a part of my epistemological interest with the help of Common Lisp. Meanwhile I will gladly try to find some answers here, to find the way out of the many vales of tears I will be trapped in.
Udyant Wig
2019-12-19 06:02:52 UTC
On 12/19/19 2:44 AM, Jochen Heller wrote:
[snip]
Of course it appears obvious to use lists and alists at early stages
and rewrite the functions later for vectors and hash tables as it is
suggested in the textbooks. Yet as a rookie I try to get a feeling for
all of them right from the beginning. Let's say I'd like to be good
friends with all of them.
A point that you may or may not have noted: an advantage of Common
Lisp's design is that lists and vectors are subtypes of sequences. So
you can program for sequences, making an initial choice of underlying
data structure, and then change to the other (if need arises), all the
while retaining the interface.

As a trivial example:
---
(defvar *list* (list 1000 2000 3000 4000 5000))
(defvar *vector* (vector 100 200 300 400 500))

(defun random-element (sequence)
  (let ((index (random (length sequence))))
    (elt sequence index)))

(random-element *list*) => <a random element of *list*>
(random-element *vector*) => <a random element of *vector*>
---

Here ELT is the function that applies to sequences in general, be they
lists or vectors.

Udyant Wig
--
We make our discoveries through our mistakes: we watch one another's
success: and where there is freedom to experiment there is hope to
improve.
-- Arthur Quiller-Couch
Jochen Heller
2019-12-21 22:45:45 UTC
Post by Udyant Wig
A point that you may or may not have noted: an advantage of Common
Lisp's design is that lists and vectors are subtypes of sequences.
[...]
Here ELT is the function that applies to sequences in general, be they
lists or vectors.
That is a good reminder. Thank you.
t***@google.com
2019-12-23 17:52:06 UTC
Post by Jochen Heller
Post by Udyant Wig
A point that you may or may not have noted: an advantage of Common
Lisp's design is that lists and vectors are subtypes of sequences.
[...]
Here ELT is the function that applies to sequences in general, be they
lists or vectors.
That is a good reminder. Thank you.
I will point out that you have to be careful about using ELT (or NTH) with
lists. Unlike with vectors or arrays, where ELT is O(1), ELT is O(N) with lists.
So you can use it for exploration, but it carries a performance penalty.

So, for example, if you were to write

(defun silly-reverse-list (input)
  (let ((result nil))
    (dolist (item input result)
      (push item result))))

you have an O(N) algorithm which only works on lists. If instead you write

(defun silly-reverse-sequence (input)
  (let ((result nil))
    (dotimes (i (length input) result)
      (push (elt input i) result))))

you have an algorithm that would be O(N) for vectors, but O(N^2) for lists.

I'll note in passing that LOOP uses different keywords for iterating over
lists and vectors, namely "in" and "across", respectively. It is too bad that
the LOOP iterator definition functionality never made it into Common Lisp.

Unfortunately, the ITERATE package is similar, either using different
keywords for vectors and lists or falling back on LENGTH and ELT.
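
A third option, sketched here as an assumption rather than something proposed in the thread, is to let a standard sequence function do the traversal. MAP with a result type of NIL visits every element of a list or a vector once, so the same definition avoids the repeated ELT calls. The name REVERSE-TO-LIST is hypothetical.

```lisp
;; Hypothetical variant of the SILLY-REVERSE functions above.
;; (MAP NIL ...) walks any sequence for side effects only, so no
;; index-based access is needed.
(defun reverse-to-list (input)
  "Return the elements of the sequence INPUT as a fresh list, reversed."
  (let ((result nil))
    (map nil (lambda (item) (push item result)) input)
    result))
```

Both (reverse-to-list '(1 2 3)) and (reverse-to-list #(1 2 3)) return the list (3 2 1).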
Barry Margolin
2019-12-24 16:44:53 UTC
Post by t***@google.com
Post by Jochen Heller
Post by Udyant Wig
A point that you may or may not have noted: an advantage of Common
Lisp's design is that lists and vectors are subtypes of sequences.
[...]
Here ELT is the function that applies to sequences in general, be they
lists or vectors.
That is a good reminder. Thank you.
I will point out that you have to be careful about using ELT (or NTH) with
lists. Unlike with vectors or arrays, where ELT is O(1), ELT is O(N) with lists.
So you can use it for exploration, but it carries a performance penalty.
So, for example, if you were to write
(defun silly-reverse-list (input)
(let ((result nil))
(dolist (item input result)
(push item result))))
you have an O(N) algorithm which only works on lists. If instead you write
(defun silly-reverse-sequence (input)
(let ((result nil))
(dotimes (i (length input) result)
(push (elt input i) result))))
you have an algorithm that would be O(N) for vectors, but O(N^2) for lists.
I'll note in passing that LOOP uses different keywords for iterating over
lists and vectors, namely "in" and "across", respectively. It is too bad that
the LOOP iterator definition functionality never made it into Common Lisp.
Unfortunately, the ITERATE package is similar, either using different
keywords for vectors and lists or falling back on LENGTH and ELT.
OTOH, unless the lists are long, the impact of O(N^2) is probably not
huge these days. Write your code in the clearest way, avoiding premature
optimization.
--
Barry Margolin, ***@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Robert L.
2019-12-16 09:20:40 UTC
Post by Jochen Heller
(defun first-sent (text)
"Takes a TEXT of type string as input and returns
either its first sentence (if it is of one piece) or
a list (if it is divided in several comma separated
parts)."
(let* ((pos-full-stop (position #\. text))
(first-sentence (if (not pos-full-stop)
""
(subseq text 0 pos-full-stop)) )
; extracting the first sentence
(amount-parts (1+ (count #\, first-sentence)))
; commas in sent. + 1 = no. of parts of sent.
(ret-lst nil) ) ; list to return a sentence with clauses
;; auxiliary procedures
(labels ((partsp (sent)
; tests for a sentence with more than one parts
(find #\, sent) )
(remove-first-space (sent)
; handles spaces at the beginning of extracted (partial)
(do ((i 0 (1+ i))) ; sentences
((not (string= sent " " :start1 i :end1 (1+ i)))
(subseq sent i) )))
(first-part-sent (sent)
; extracts the first partial sentence of the first sentence
(let ((pos-comma (position #\, sent)))
(remove-first-space (subseq sent 0 pos-comma)) )))
;; body
(if (not (partsp first-sentence))
; The first sentence is of one piece
(remove-first-space first-sentence)
; return the first sentence as a string
;;
(do ((i amount-parts (1- i)))
((= i 0) (nreverse ret-lst))
; else return it splitted into a list of strings
(if (not (partsp first-sentence))
; as soon as the first sentence is eaten up
(push (remove-first-space first-sentence) ret-lst)
; push the last part into the list
(progn ; until then
(push (first-part-sent first-sentence) ret-lst)
; push any part into the list
(setf first-sentence
; and eat up the first sentence part by part
(subseq first-sentence
(1+ (position #\, first-sentence)))))))))))
Testing:

* (first-sent "Foo, on, the bar")

debugger invoked on a SB-KERNEL:BOUNDING-INDICES-BAD-ERROR in thread

* (first-sent "Foo on the bar.")

"Foo on the bar"

* (first-sent " Foo on the bar. xxx. ")

"Foo on the bar"

* (first-sent " Foo, on, the bar. xxx. ")

("Foo" "on" "the bar")

* (first-sent " Foo, on, the bar,. xxx. ")

debugger invoked on a SB-KERNEL:BOUNDING-INDICES-BAD-ERROR in thread

* (first-sent " Foo, on, the bar, . xxx. ")

debugger invoked on a SB-KERNEL:BOUNDING-INDICES-BAD-ERROR in thread
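
A possible reading of the three failures above, offered as an assumption since the thread does not spell it out: REMOVE-FIRST-SPACE keeps comparing one-character substrings without ever checking for the end of the string, so an empty or all-space piece pushes :END1 past the string's length. A minimal sketch that sidesteps this with the standard STRING-LEFT-TRIM (the name SAFE-REMOVE-FIRST-SPACE is hypothetical):

```lisp
;; Hypothetical replacement for REMOVE-FIRST-SPACE from the original post.
;; STRING-LEFT-TRIM is standard Common Lisp and stops at the end of the
;; string, so empty and all-space inputs are safe.
(defun safe-remove-first-space (sent)
  "Return SENT without its leading spaces."
  (string-left-trim " " sent))
```

With this helper, "  the bar" becomes "the bar", while "" and " " both become "" instead of signalling an error.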


Gauche, Chicken, or Racket.

(use srfi-13) ; String functions for Gauche or Chicken
(use srfi-14) ; Character sets for Gauche or Chicken
or
(require srfi/13) ; String functions for Racket
(require srfi/14) ; Character sets for Racket

(define (first-sentence text)
  (let* ((p (string-index text #\.))
         (text (substring text 0 p)))
    (map string-trim-both
         (string-tokenize
          text
          (char-set-complement (char-set #\,))))))

gosh> (first-sentence "Foo on the bar.")
("Foo on the bar")
gosh> (first-sentence "Foo on the bar. xxx.")
("Foo on the bar")
gosh> (first-sentence " Foo, on , the bar. xxx.")
("Foo" "on" "the bar")
gosh> (first-sentence " Foo, on , the bar,. xxx.")
("Foo" "on" "the bar")
gosh> (first-sentence " Foo, on , the bar, . xxx.")
("Foo" "on" "the bar" "")
Jochen Heller
2019-12-17 16:58:16 UTC
Many thanks for these additional test cases.

Yes, I didn't think of wrong punctuation, only of too many spaces.
Robert Munyer
2019-12-16 23:59:50 UTC
Post by Jochen Heller
I hope you can still give me feedback on how to use a better
style, or on whether this is a case where it is OK to formulate a
solution like this.
A very old tradition, which might be useful to you: if you start
by tokenizing your string into a list of symbols, then you can do
most of the work with symbol manipulation, which is often nicer
than string manipulation.

? (defparameter *input*
    (concatenate 'string
      "i need some help, that much seems certain. perhaps i"
      " could learn to get along with my mother. you are not very"
      " aggressive, but i think you don't want me to notice that"))
*INPUT*

? (tokenize *input*)
(I NEED SOME HELP |,| THAT MUCH SEEMS CERTAIN |.| PERHAPS I COULD
LEARN TO GET ALONG WITH MY MOTHER |.| YOU ARE NOT VERY AGGRESSIVE
|,| BUT I THINK YOU |DON'T| WANT ME TO NOTICE THAT)

? (split-at '|.| *)
((I NEED SOME HELP |,| THAT MUCH SEEMS CERTAIN)
(PERHAPS I COULD LEARN TO GET ALONG WITH MY MOTHER)
(YOU ARE NOT VERY AGGRESSIVE |,| BUT I THINK YOU |DON'T| WANT ME
TO NOTICE THAT))

? (mapcar (lambda (sentence) (split-at '|,| sentence)) *)
(((I NEED SOME HELP) (THAT MUCH SEEMS CERTAIN))
((PERHAPS I COULD LEARN TO GET ALONG WITH MY MOTHER))
((YOU ARE NOT VERY AGGRESSIVE)
(BUT I THINK YOU |DON'T| WANT ME TO NOTICE THAT)))

? (format t "~{~%Sentence: ~{~% Phrase:~(~{ ~a~}~)~}~}~2%" *)

Sentence:
Phrase: i need some help
Phrase: that much seems certain
Sentence:
Phrase: perhaps i could learn to get along with my mother
Sentence:
Phrase: you are not very aggressive
Phrase: but i think you don't want me to notice that

NIL

? (defun first-sentence (paragraph) (first paragraph))
FIRST-SENTENCE

? (first-sentence ***)
((I NEED SOME HELP) (THAT MUCH SEEMS CERTAIN))

? (format t "~%~@(~{~{~a~^ ~}~^, ~}.~)~2%" *)

I need some help, that much seems certain.

NIL

You'll notice that I made two passes through the tokenized input,
one to find sentences and another to find phrases. You could do
both of those things (and more) in a single pass, by using a parser
instead of a simple splitter. There are parser libraries that you
can download, or you could write your own simple parser.
Post by Jochen Heller
"Takes a TEXT of type string as input and returns either its
first sentence (if it is of one piece) or a list (if it is
divided in several comma separated parts)."
When you don't know whether your function's value will represent
one X or several Xs, it's usually better to have it return a list
of Xs in every situation, including when there is only one X.
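
Applied to the string-based approach of the original post, that advice might look like the sketch below. SENTENCE-PARTS is a hypothetical name, and the code is written with DO* only because the original poster prefers DO; it is one possible shape, not the author's function.

```lisp
;; Hypothetical helper: always returns a list of strings, even when the
;; sentence contains no comma at all.
(defun sentence-parts (sentence)
  "Return the comma-separated parts of SENTENCE as a list of trimmed strings."
  (do* ((start 0 (1+ comma))                       ; just past the last comma
        (comma (position #\, sentence)             ; next comma, or NIL
               (position #\, sentence :start start))
        (parts (list (string-trim " " (subseq sentence 0 comma)))
               (cons (string-trim " " (subseq sentence start comma)) parts)))
       ((null comma) (nreverse parts))))           ; no comma left: done
```

A sentence without commas, such as "Foo on the bar", yields the one-element list ("Foo on the bar"), while " Foo, on , the bar" yields ("Foo" "on" "the bar"). A trailing comma produces an empty string as the last part, which a caller may want to filter out.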

P.S. SPLIT-AT and TOKENIZE are not built-in. SPLIT-AT can be
done with about 10 lines of code, and TOKENIZE can be done with
readtable tricks.
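
For illustration, one way SPLIT-AT could be written in roughly that many lines is shown below. This is a guess at its behaviour based only on the transcript above (delimiters are dropped, each group becomes a sublist); the actual definition was not posted, and TOKENIZE is not attempted here.

```lisp
;; A sketch of SPLIT-AT inferred from the REPL transcript; the real
;; definition is not shown in the thread.
(defun split-at (delimiter list)
  "Split LIST into sublists at each element EQL to DELIMITER.
The delimiters themselves are dropped."
  (let ((groups '())
        (current '()))
    (dolist (item list)
      (cond ((eql item delimiter)
             (push (nreverse current) groups)  ; close the current group
             (setf current '()))
            (t (push item current))))
    ;; Keep whatever follows the last delimiter, unless it is empty.
    (when current
      (push (nreverse current) groups))
    (nreverse groups)))
```

With this definition, (split-at '|.| '(a b |.| c |,| d |.| e)) returns ((A B) (C |,| D) (E)). Two delimiters in a row produce an empty group, which may or may not match the original.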
--
-- Robert Munyer code below generates e-mail address

(format nil "~(~{~a~^ ~}~)" (reverse `(com dot munyer at ,(* 175811 53922))))
Jochen Heller
2019-12-17 17:01:22 UTC
Post by Robert Munyer
A very old tradition, which might be useful to you: if you start
by tokenizing your string into a list of symbols, then you can do
most of the work with symbol manipulation, which is often nicer
than string manipulation.
Thank you very much for that advice. It seems that experimenting with it could be very satisfying.
Jochen Heller
2019-12-17 17:07:47 UTC
Post by Robert Munyer
When you don't know whether your function's value will represent
one X or several Xs, it's usually better to have it return a list
of Xs in every situation, including when there is only one X.
Oh, and doubly so for that one. It makes sense when I think of the further processing :-)
Robert L.
2019-12-17 20:56:06 UTC
Post by Robert Munyer
(format nil "~(~{~a~^ ~}~)" (reverse `(com dot munyer at ,(* 175811 53922))))
(regexp-replace*
(string-join
(reverse (map x->string `(com dot munyer at ,(* 175811 53922))))
"")
"dot" "." "at" "@")

===>
"***@munyer.com"
Blair Vidakovich
2019-12-17 22:28:14 UTC
Post by Robert Munyer
Post by Jochen Heller
I hope you can still give me feedback on how to use a better
style, or on whether this is a case where it is OK to formulate a
solution like this.
A very old tradition, which might be useful to you: if you start
by tokenizing your string into a list of symbols, then you can do
most of the work with symbol manipulation, which is often nicer
than string manipulation.
? (defparameter *input*
(concatenate 'string
"i need some help, that much seems certain. perhaps i"
" could learn to get along with my mother. you are not very"
" aggressive, but i think you don't want me to notice that"))
*INPUT*
? (tokenize *input*)
(I NEED SOME HELP |,| THAT MUCH SEEMS CERTAIN |.| PERHAPS I COULD
LEARN TO GET ALONG WITH MY MOTHER |.| YOU ARE NOT VERY AGGRESSIVE
|,| BUT I THINK YOU |DON'T| WANT ME TO NOTICE THAT)
? (split-at '|.| *)
((I NEED SOME HELP |,| THAT MUCH SEEMS CERTAIN)
(PERHAPS I COULD LEARN TO GET ALONG WITH MY MOTHER)
(YOU ARE NOT VERY AGGRESSIVE |,| BUT I THINK YOU |DON'T| WANT ME
TO NOTICE THAT))
? (mapcar (lambda (sentence) (split-at '|,| sentence)) *)
(((I NEED SOME HELP) (THAT MUCH SEEMS CERTAIN))
((PERHAPS I COULD LEARN TO GET ALONG WITH MY MOTHER))
((YOU ARE NOT VERY AGGRESSIVE)
(BUT I THINK YOU |DON'T| WANT ME TO NOTICE THAT)))
? (format t "~{~%Sentence: ~{~% Phrase:~(~{ ~a~}~)~}~}~2%" *)
Sentence:
Phrase: i need some help
Phrase: that much seems certain
Sentence:
Phrase: perhaps i could learn to get along with my mother
Sentence:
Phrase: you are not very aggressive
Phrase: but i think you don't want me to notice that
NIL
? (defun first-sentence (paragraph) (first paragraph))
FIRST-SENTENCE
? (first-sentence ***)
((I NEED SOME HELP) (THAT MUCH SEEMS CERTAIN))
? (format t "~%~@(~{~{~a~^ ~}~^, ~}.~)~2%" *)
I need some help, that much seems certain.
NIL
You'll notice that I made two passes through the tokenized input,
one to find sentences and another to find phrases. You could do
both of those things (and more) in a single pass, by using a parser
instead of a simple splitter. There are parser libraries that you
can download, or you could write your own simple parser.
Post by Jochen Heller
"Takes a TEXT of type string as input and returns either its
first sentence (if it is of one piece) or a list (if it is
divided in several comma separated parts)."
When you don't know whether your function's value will represent
one X or several Xs, it's usually better to have it return a list
of Xs in every situation, including when there is only one X.
P.S. SPLIT-AT and TOKENIZE are not built-in. SPLIT-AT can be
done with about 10 lines of code, and TOKENIZE can be done with
readtable tricks.
I really agree with this -- a few of the textbooks I have been reading
lately have recommended that one develop one's data structures as
concisely and logically as possible, and avoid as much of the imperative
programming paradigm as possible.

I don't know much LISP, but I do enjoy making as much of my program
state as immutable as possible :-)
Udyant Wig
2019-12-18 05:32:27 UTC
On 12/18/19 3:58 AM, Blair Vidakovich wrote:
[snip]
and avoid as much of the imperative programming paradigm as possible.
What alternatives were suggested? How much should they be used in a
project -- partly or wholly?

Udyant Wig
--
We make our discoveries through our mistakes: we watch one another's
success: and where there is freedom to experiment there is hope to
improve.
-- Arthur Quiller-Couch
Blair Vidakovich
2019-12-19 02:32:40 UTC
This might perhaps seem a little below your level, but the texts that
referred to this idea of using as little of the imperative paradigm and
as much of the functional paradigm as possible are the ones that I have
linked in this message!

The first is a book for beginners on making LISP games, and it goes
through a lot of styles you can use in LISP, and recommends the use of a
couple:

https://doc.lagout.org/programmation/Lisp/Land%20of%20Lisp_%20Learn%20to%20Program%20in%20Lisp%2C%20One%20Game%20at%20a%20Time%20[Barski%202010-11-15].pdf

(Sorry about the terrible link linting..)

The CMU Common LISP gentle introduction to symbolic computation doesn't
mention it explicitly, but if you work through the book you will end up
doing very little imperative coding - you will almost never
destroy/change any memory:

https://www.cs.cmu.edu/~dst/LispBook/book.pdf

LISP is not like Haskell, I am told, and you are not forced to program
in a purely functional/declarative paradigm, so you certainly can have
data mutability.

But I quite like using LISP as functionally/declaratively as possible,
because I think it is good practice. I learned a little about how to do
it in the language 'R', actually - you are almost always copying memory
and leaving the input intact.

BUT - and you'll see in the PDF I attached on beginner LISP games, you
actually do need to have data mutability sometimes. It usually makes the
project easier overall, so claims the author.

Just anecdotally, I really dislike the imperative programming
paradigm. It does not really gel with the way I think of computers, and
when I discovered LISP (as well as Smalltalk), I was overjoyed at the
fact I could leave imperative programming behind.

Hopefully this answers your question -- take all of this completely
relatively, none of this is actually meant to be a recommendation to do
anything the way I am saying, I just find LISP's capacity to leave data
unchanged and immutable to be a much more beautiful way of doing
computation :-)
Udyant Wig
2019-12-19 06:38:35 UTC
On 12/19/19 8:02 AM, Blair Vidakovich wrote:
[snip]
Post by Blair Vidakovich
Just anecdotally, I really dislike the imperative programming
paradigm. It does not really gel with the way I think of computers,
and when I discovered LISP (as well as Smalltalk), I was overjoyed at
the fact I could leave imperative programming behind.
What is the way you think of computers?

Udyant Wig
--
We make our discoveries through our mistakes: we watch one another's
success: and where there is freedom to experiment there is hope to
improve.
-- Arthur Quiller-Couch
Blair Vidakovich
2019-12-19 21:17:40 UTC
Post by Udyant Wig
[snip]
Post by Blair Vidakovich
Just anecdotally, I really dislike the imperative programming
paradigm. It does not really gel with the way I think of computers,
and when I discovered LISP (as well as Smalltalk), I was overjoyed at
the fact I could leave imperative programming behind.
What is the way you think of computers?
Udyant Wig
That's a really broad question!!

I suppose I think of computers as calculating units connected to banks
of stored programs. I wish there was another kind of computer I could
think of, and there probably is another, better kind of computer than
the "stored program computer".

But even if it is just a series of instructions passed into calculating
system units, that is still very exciting to me.

I really need to look more into how LISP machines function - I actually
want to build a physical LISP machine, I think that could be really
cool! :-)

Blair.
t***@google.com
2019-12-20 00:01:56 UTC
Permalink
Post by Blair Vidakovich
I really need to look more into how LISP machines function - I actually
want to build a physical LISP machine, I think that could be really
cool! :-)
Some of the list readers worked on the Lisp machine.

My somewhat more cursory knowledge of them was that they were not hugely
different from non-Lisp machine architectures but with some features to make
an efficient lisp implementation easier:
* Extra tag bits to help with type tagging of data and data addresses.
* Microcode instructions to support basic lisp operations like CAR, CDR, CONS, etc.

I'm not sure if there was anything for GC support.
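
To make the tag-bit idea concrete, here is a minimal sketch that simulates tagged machine words with plain integers. The 2-bit layout, the tag assignments, and the names TAG-WORD, WORD-TAG and WORD-PAYLOAD are hypothetical; they illustrate the general technique and are not meant to describe any particular Lisp machine.

```lisp
;; Hypothetical layout: the low 2 bits of a word hold a type tag,
;; the remaining bits hold the payload (a number or an address).
(defconstant +tag-bits+ 2)
(defparameter *tags* '((:fixnum . 0) (:cons . 1) (:symbol . 2) (:other . 3)))

(defun tag-word (type payload)
  "Pack PAYLOAD and the tag for TYPE into one integer word."
  (logior (ash payload +tag-bits+) (cdr (assoc type *tags*))))

(defun word-tag (word)
  "Read the type back out of the low tag bits of WORD."
  (car (rassoc (ldb (byte +tag-bits+ 0) word) *tags*)))

(defun word-payload (word)
  "Shift the tag bits away to recover the payload of WORD."
  (ash word (- +tag-bits+)))
```

With such a layout, a type check is a matter of inspecting a few bits of the word itself; hardware tag support presumably made that check cheap.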

Lisp machines fell out of favor for a couple of reasons:
* Price. They were niche machines and didn't get the benefits of economies of
scale when individual workstations really took off. Sun did well.
* They were more specialized and lost out to the more general purpose
workstations.
* The implementors of Lisp on standard hardware came up with a number of good
techniques to produce efficient Lisp systems without needing the specialized
hardware. This then benefited from the price advantage of stock hardware.
Blair Vidakovich
2019-12-22 13:21:06 UTC
Permalink
Post by t***@google.com
Post by Blair Vidakovich
I really need to look more into how LISP machines function - I actually
want to build a physical LISP machine, I think that could be really
cool! :-)
Some of the list readers worked on the Lisp machine.
My somewhat more cursory knowledge of them was that they were not hugely
different from non-Lisp machine architectures but with some features to make
an efficient lisp implementation easier:
* Extra tag bits to help with type tagging of data and data addresses.
* Microcode instructions to support basic lisp operations like CAR, CDR, CONS, etc.
I'm not sure if there was anything for GC support.
Lisp machines fell out of favor for a couple of reasons:
* Price. They were niche machines and didn't get the benefits of economies of
scale when individual workstations really took off. Sun did well.
* They were more specialized and lost out to the more general purpose
workstations.
* The implementors of Lisp on standard hardware came up with a number of good
techniques to produce efficient Lisp systems without needing the specialized
hardware. This then benefited from the price advantage of stock hardware.
Oh wow! This is really informative!!

Thanks so much for the considered post. Can you recommend any further
reading? I am currently acting like a sponge and am trying to take in as
much information about LISP as possible. Moving over to EMACS for quite a
lot of my online activity has definitely helped as well - it forces me to
use a LISP-flavoured language a lot every day :-)
Jochen Heller
2019-12-18 21:27:25 UTC
Permalink
By the way:

Besides the many local functions, I didn't like having to SETF local variables, but I had no better idea at first.

Now I'm thinking of another function which collects the positions of the full stops in a text (which may not contain "...") and uses them to iterate through the string instead of eating it up. Maybe then I won't even need DO anymore and can work with DOLIST. I'll try ...
Jochen Heller
2019-12-21 22:50:09 UTC
Permalink
While playing around and brainstorming I ran into some strange behaviour to contemplate:

(defun full-stop-positions (text)
"Takes a TEXT of type string as input and returns a list of the positions of the full stop characters."
;;(let ((pos-lst '(0)))
;; (dotimes (i (length text)) ; Weird: If I call the procedure with "bar." first
;; (when (equal (elt text i) #\.) ; it returns as expected (0 3), yet if I call it a
;; (push i pos-lst) )) ; second time with "" it returns (3 0). A third time
;;(nreverse pos-lst) )) ; with "" returns (0) as it should.
; ??? Is it somehow a side effect of the closure
; which keeps the local variables alive???

(do ((i (1- (length text)) (1- i)) ; counting down avoids NREVERSE at the end :-)
     (pos-lst '() (if (not (equal (elt text i) #\.)) ; KLUDGE :-/ until I know, why
                      (cons -1 pos-lst)              ; (when (equal (elt text i) #\.)
                      (cons i pos-lst))))            ; (cons i pos-lst) ) doesn't work
    ((<= i 0) (cons 0 (remove -1 pos-lst)))))
;; How is a variable assigned differently in DO? Isn't it eventually expanded to SETQ as well?
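
A likely explanation for the kludge, offered as a sketch and not as the poster's own diagnosis: in DO, each variable is simply assigned the value of its step form on every iteration. WHEN returns NIL if its test fails, so a step form like (when ... (cons i pos-lst)) resets POS-LST to NIL whenever the current character is not a full stop. Returning the old list in the else branch avoids the -1 placeholders. The name FULL-STOP-POSITIONS-SKETCH is made up for this example, and the code assumes TEXT is a string.

```lisp
(defun full-stop-positions-sketch (text)
  "Return 0 followed by the positions (after index 0) of the full stops in TEXT."
  (do ((i (1- (length text)) (1- i))          ; start at the last valid index
       (pos-lst '() (if (char= (char text i) #\.)
                        (cons i pos-lst)      ; found a full stop: record it
                        pos-lst)))            ; otherwise keep the old list, not NIL
      ((<= i 0) (cons 0 pos-lst))))           ; also terminates for the empty string
```

Because the step forms of DO are evaluated with the old values of all variables, (char text i) in the second clause still sees the index from before the decrement.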
Kaz Kylheku
2019-12-22 00:38:10 UTC
Permalink
Post by Jochen Heller
(defun full-stop-positions (text)
"Takes a TEXT of type string as input and returns a list of the positions of the full stop characters."
;;(let ((pos-lst '(0)))
;; (dotimes (i (length text)) ; Weird: If I call the procedure with "bar." first
;; (when (equal (elt text i) #\.) ; it returns as expected (0 3), yet if I call it a
;; (push i pos-lst) )) ; second time with "" it returns (3 0). A third time
;;(nreverse pos-lst) )) ; with "" returns (0) as it should.
Unlike reverse, which returns fresh (or at least unmutated) structure,
nreverse is permitted to mutate the input to achieve the output,
which is why that function exists.

Though most of pos-lst is freshly allocated on each call to the
function, it terminates with (0), which is produced by the literal
constant '(0). Mutating that object makes the program self-modifying,
which is undefined behavior.

Try (pos-lst (list 0)). Another idea is that since the 0 will end up at
the front of the final list, start with an empty pos-lst, and then, at
the end return (cons 0 (nreverse pos-lst))
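
The second suggestion can be sketched as follows. The function name FULL-STOP-POSITIONS-FRESH is hypothetical and TEXT is assumed to be a string; the point is only that every cons cell handed to NREVERSE is freshly allocated, so no literal constant is ever mutated.

```lisp
(defun full-stop-positions-fresh (text)
  "Return a list starting with 0, followed by the positions of the full stops in TEXT."
  (let ((pos-lst '()))                 ; empty list: no literal cons cell to mutate
    (dotimes (i (length text))
      (when (char= (char text i) #\.)
        (push i pos-lst)))             ; PUSH allocates a fresh cell each time
    (cons 0 (nreverse pos-lst))))      ; the 0 is added to the front at the end
```

Called repeatedly with "bar." and "", this version should give the same answer for the same input every time, since nothing survives from one call to the next.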
Jochen Heller
2019-12-22 07:42:53 UTC
Permalink
Post by Kaz Kylheku
Unlike reverse, which returns fresh (or at least unmutated) structure,
nreverse is permitted to mutate the input to achieve the output,
which is why that function exists.
Though most of pos-lst is freshly allocated on each call to the
function, it terminates with (0), which is produced by the literal
constant '(0). Mutating that object makes the program self-modifying,
which is undefined behavior.
Aaaaaaahhhh.
Post by Kaz Kylheku
Try (pos-lst (list 0)).
Thank you!
Post by Kaz Kylheku
Another idea is that since the 0 will end up at
the front of the final list, start with an empty pos-lst, and then, at
the end return (cons 0 (nreverse pos-lst))
Yeah, that solution came to me in the second DO.