I’ve seen several ObjectStudio applications that do heavy String manipulation. Most of the time, String>>+ is used to concatenate two strings. This method is pretty expensive: like Array>>add:, it creates a new String and copies both operands into it. Especially inside a loop, the overhead of creating and destroying temporary instances adds up. Instead of String>>+ or String>>, (the comma method), I recommend using Streams.

IMHO, the easiest way to create a WriteStream is

writeStream := String new writeStream.
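
To make the difference concrete, here is a minimal sketch that joins a few strings both ways. The variable names and the sample words are made up for illustration; String>>+ is the ObjectStudio concatenation mentioned above, and nextPutAll: and contents are the usual stream protocol.

```smalltalk
| words joined stream |
words := #('alpha' 'beta' 'gamma').

"Concatenation: every + creates a new String and copies both operands into it."
joined := ''.
words do: [:each | joined := joined + each].

"Stream: each piece is appended to one growing buffer,
 and the final String is fetched once with contents."
stream := String new writeStream.
words do: [:each | stream nextPutAll: each].
joined := stream contents.
```

Both variants should leave the same String in joined; the stream version just avoids building a throwaway String on every pass through the loop.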

In order to prove my statement, I rewrote String>>breakUsing: using Streams.
Before:

oldBreakUsing: aString
    | ans subpart |
    ans := OrderedCollection new.
    subpart := ''.
    self do:
        [:ch |
        (aString includes: ch)
            ifTrue:
                [subpart notEmpty ifTrue: [ans add: subpart].
                subpart := '' + ch]
            ifFalse: [subpart := subpart + ch]].
    subpart notEmpty ifTrue: [ans add: subpart].
    ^ans asArray

After:

breakUsing: aString
    | ans subpart |
    ans := OrderedCollection new.
    subpart := String new writeStream.
    self do:
        [:ch |
        ((aString includes: ch) and: [subpart notEmpty])
            ifTrue:
                [ans add: subpart contents.
                subpart reset].
        subpart nextPut: ch].
    subpart notEmpty ifTrue: [ans add: subpart contents].
    ^ans asArray

The test is very simple:

Time millisecondsToRun: [100000 timesRepeat: [
'abbbbbbbbbbc-dfffffffffffef-gggggggggggggggghi-jjjjjjjjjjjjkl-mmmmmmmmmmmmno-ppppppppppqr-stu-vw-xzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzyz' breakUsing: '-' ]].

Time millisecondsToRun: [100000 timesRepeat: [
'abbbbbbbbbbc-dfffffffffffef-gggggggggggggggghi-jjjjjjjjjjjjkl-mmmmmmmmmmmmno-ppppppppppqr-stu-vw-xzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzyz' oldBreakUsing: '-' ]].

The new code takes 1683 ms, versus 6148 ms for the old implementation, which makes it roughly 3.65 times faster. I know that ObjectStudio itself still uses String>>+ all over the place, but when something catches our eye, we try to change it. Maybe you can tell us which areas are especially painful for you?

Andreas
