
Benefits of Dynamic Typing


Marshall Spight

Sep 24, 2005, 3:56:29 PM
Hi all,

I am curious as to what people see as the primary benefits
of dynamic typing. I have read lots of threads in which
the static and dynamic advocates debate, and I find them
largely unsatisfying. In general, it strikes me that the
advocates of one camp do a poor job of characterizing the
costs and benefits of the other camp, which should perhaps
not be surprising. That is why I am more interested in
hearing the benefits of dynamic typing from those who
practice it. I am sure it will be different for
different people.

I would like to hear what the dynamic proponents consider
the strengths of dynamic typing, and perhaps also what
aspects of it don't matter so much. For example, I have
heard people talk of the benefits of being able to execute
code before it is fully elaborated; in a dynamic language,
one never needs to code stub methods just to execute what
one has coded so far. I can see how this would be of value.
IIUC, in a dynamic language, a variable may contain values
of different types over its lifespan; is *this* valuable?
It strikes me that it might be confusing.

I am hopeful (but not confident) of not starting a flamewar,
despite the contentiousness of the topic. My experience with
this newsgroup is that it is full of highly educated,
well-mannered folk, with good perspective on language design,
which is why I picked this forum to ask the question. I
hope we can concentrate on statements in the positive form
("I like dynamically-typed language X because I can do Y.")
rather than in the negative form ("I dislike statically-typed
language X because I can't do Y.")

Also, if one has any references to good books or papers on
the topic, I'd be appreciative. I have ten books on my shelf
about various topics in type theory; they are all of them
written from a static-typing perspective. I have read (much of)
CTM, but that's not really what I mean; while the book's
code is in a dynamically typed language, the book isn't *about*
dynamic typing.

Thanks,


Marshall

Jon Harrop

Sep 24, 2005, 10:07:58 PM
Marshall Spight wrote:
> I am curious as to what people see as the primary benefits
> of dynamic typing. I have read lots of threads in which
> the static and dynamic advocates debate, and I find them
> largely unsatisfying. In general, it strikes me that the
> advocates of one camp do a poor job of characterizing the
> costs and benefits of the other camp, which should perhaps
> not be surprising. That is why I am more interested in
> hearing the benefits of dynamic typing from those who
> practice it. I am sure it will be different for
> different people.

I am also very interested to hear people's opinions on this.

My answer isn't exactly what you were asking for. I use both dynamically
typed (Mathematica) and statically typed (OCaml) languages. However, I use
dynamically typed languages when the benefits of their capabilities
outweigh the pain of dynamic typing. I would prefer it if the dynamically
typed languages were more statically typed.

I've recently been playing with MetaOCaml. This is a fascinating project
that allows you to defer computations (like Lisp, but faster, shorter,
statically type checked and more elegant ;-). My latest attempts indicate
that static typing can be a hindrance here. Specifically, I can't write a
staged interpreter without having the generated code box most values.
Perhaps this is an example of a place where we'd be better off without
static typing. I'm not sure though, perhaps the solution is no typing
rather than dynamic typing.
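
To make the boxing concrete, here is a minimal sketch in plain OCaml (not
MetaOCaml) of the kind of universal wrapper I mean; the value and expr
types are only an illustration:

type value =
  | VInt of int
  | VBool of bool

type expr =
  | Lit of value
  | Add of expr * expr
  | If of expr * expr * expr

(* Every intermediate result is boxed in value and unboxed on use; a
   staged version of this interpreter inherits the same boxing in the
   generated code. *)
let rec eval : expr -> value = function
  | Lit v -> v
  | Add (a, b) ->
      (match eval a, eval b with
       | VInt x, VInt y -> VInt (x + y)
       | _ -> failwith "run-time type error")
  | If (c, t, e) ->
      (match eval c with
       | VBool true -> eval t
       | VBool false -> eval e
       | _ -> failwith "run-time type error")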

Another possibility is GUI code. This can be very dynamic in nature and is
potentially the remit of dynamic typing. However, despite this being a
pedagogical example of OOP, I've found variants and HOFs to work just fine.

So I'm also still looking for a killer example of the utility of dynamic
typing...

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com

Ben Rudiak-Gould

Sep 25, 2005, 9:07:12 PM
Marshall Spight wrote:
> I am curious as to what people see as the primary benefits
> of dynamic typing.

I rarely use dynamically typed languages, but...

I think that the major advantage of dynamic typing is that it's much easier
to design a good dynamically typed language than a good statically typed
language. When your language is statically typed, and especially when it's
Hindley-Milner, new language features often interact in tricky ways with the
type system, and therefore with each other. It's a difficult theoretical
problem to figure out how to add objects, runtime reflection,
metaprogramming, dynamic linking, on-line code replacement, etc. to a
statically typed language. In a dynamically typed language, you can usually
just add things in the most straightforward way, and it'll work as long as
the programmer follows some straightforward rules. And you don't have to
worry about your extension precluding some other useful extension in the future.

> Also, if one has any references to good books or papers on
> the topic, I'd be appreciative. I have ten books on my shelf
> about various topics in type theory; they are all of them
> written from a static-typing perspective.

I don't think there is a type theory of dynamically typed languages. The
term "dynamically typed" is really a misnomer; these languages don't have
types at all, just values. Sure, the + function will check at runtime that
its arguments are numbers, but that's not really the same thing. This kind
of domain check is much closer to / checking at runtime that the divisor is
not zero, or car/head checking that the argument is not the empty list.
People sometimes say that "static typing means no type errors at runtime",
but I would say the same of dynamic typing. Both kinds of languages have
value errors at runtime.
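
For instance, both of the following are perfectly well-typed OCaml, and
both perform a runtime domain check of exactly this kind:

let () =
  (try ignore (1 / 0)
   with Division_by_zero -> print_endline "divisor was zero");
  (try ignore (List.hd [])
   with Failure _ -> print_endline "list was empty")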

There's plenty of work on "value theories" of dynamically typed languages --
for example, Church's original work on the untyped lambda calculus.

-- Ben

Joachim Durchholz

Sep 26, 2005, 6:50:22 AM
Ben Rudiak-Gould schrieb:

> I think that the major advantage of dynamic typing is that it's much
> easier to design a good dynamically typed language than a good
> statically typed language. When your language is statically typed, and
> especially when it's Hindley-Milner, new language features often
> interact in tricky ways with the type system, and therefore with each
> other. It's a difficult theoretical problem to figure out how to add
> objects, runtime reflection, metaprogramming, dynamic linking, on-line
> code replacement, etc. to a statically typed language.

Agreed.

> In a dynamically typed language, you can usually just add things in
> the most straightforward way, and it'll work as long as the
> programmer follows some straightforward rules. And you don't have to
> worry about your extension precluding some other useful extension in
> the future.

Um... I think the trade-off is somewhat different.

A static checker buys you a guarantee that there will be no unexpected
interactions between the various language features.

In a system without a checker, you're basically on your own.
In other words, things will mostly work, but the edge cases may come
back and bite you. Usually they won't (they are edge cases after all)...
though I suspect that they become more of a problem if systems get large
(i.e. more opportunities for an edge case to arise) or with
inexperienced programmers.
You said programmers would have to "follow some straightforward rules",
but I don't think the rules are straightforward - they are fuzzy,
defined by "what works", so you have to try whenever you approach one of
the limits (and you never know whether it will break tomorrow when you
add something else that stretches the limits).

Of course, these things are more important when you are like me - I
*love* stretching limits :-) (and maybe that's the reason why I find
dynamic systems unacceptable - this may also explain why the static vs.
dynamic typing debate is such a controversial issue: if it's a question of
personal workstyle and personality, personal arguments prevail over
technical ones.)

Regards,
Jo

James

Sep 26, 2005, 10:06:11 AM
Ben Rudiak-Gould wrote:

>It's a difficult theoretical problem to figure out how to add objects, runtime
>reflection, metaprogramming, dynamic linking, on-line code replacement, etc.
>to a statically typed language.

I think you've generally hit upon the answer, but I'd rephrase your
"it's much easier to design a good dynamically typed language" as
"dynamic typing is the simpler solution."

So many arguments about functional languages come down to "well, that
would be possible given a sufficiently smart compiler" (see
http://c2.com/cgi/wiki?SufficientlySmartCompiler) or a sufficiently
smart type system or whatnot. Many times the off-the-cuff response is
along the lines of "Oh, you can do that in [fill-in name of defunct
research project language here]." Dynamic typing is a one-shot system
that deals with on-the-fly code reloading, integers whose overflow is
handled in all cases (not only when you use a special subtype of
integers),
message passing between threads, etc. Yes you *can* do all this with
the proper type system, but it's not like there are many such languages
to choose from. The choice comes down to either "solving" these
problems today or putting off coding until an ideal type system is
developed.

And while I'm being controversial in my first c.l.f post in years:
Nobody--including me--is going to say that static typing is useless,
but I'd say that it tends to distract many great minds from more
important problems. Obviously static typing isn't a primary
attribute of usefulness, otherwise we wouldn't have Python and Perl and
Erlang and Lisp and Lua. So you have people who are spending all their
time doing research into type systems, when working programmers have
found that you get 90% of the way there by writing clean, concise code
and creating a test suite as you go. You still need the test suite in
statically typed languages, of course, because you can't tell if you
passed in the parameters to make_rect(x,y,w,h) in the right order (or
was it make_rect(x1,y1,x2,y2) or was it make_rect(x,w,y,h)?). It's
also easy enough to write a tool to help verify that functions are
being called in consistent ways, like dialyzer for Erlang.
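
(As an aside: labelled arguments, as in OCaml, are one static-language
mitigation for exactly this; make_rect below is just a toy illustration.)

(* With labels the call site names each argument, so a positional
   mix-up of same-typed parameters can't happen silently. *)
let make_rect ~x ~y ~w ~h = (x, y, w, h)
let r = make_rect ~x:0 ~y:0 ~w:640 ~h:480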

Further, I'd say that the correct way to create many applications isn't
to write them in pure OCaml or Erlang or whatever, but to create a
language that encapsulates exactly what you need. Then you can check
whatever properties of your program that you like. You don't have to
rely on the type system of the underlying language. Use a dynamically
typed language for flexibility, but maybe the resulting language has
some statically typed elements.


Aaron Gray

Sep 26, 2005, 3:41:35 PM
> I am curious as to what people see as the primary benefits
> of dynamic typing.

Less physical typing at the keyboard :)

Typically there are no braces or begin...end's to type. The syntax is
generally more simplistic than statically typed languages.

Typically the development cycle is quicker.

...

Aaron


Joachim Durchholz

Sep 26, 2005, 4:03:02 PM
Aaron Gray schrieb:

>> I am curious as to what people see as the primary benefits of
>> dynamic typing.
>
> Less physical typing at the keyboard :)
>
> Typically there are no braces or begin...end's to type.

Lisp: (...)
Perl: {...}

Don't think that that's the real difference - I'd be hard-pressed to
name a language that has no block structure markers. (Well, some *very*
old Basics come to my mind, along with RPG IV and other nightmares...
but I'm pretty sure you didn't mean *these* *ggg*.)

> The syntax is generally more simplistic than statically typed
> languages.

Agreed with that one - anybody got a clue why that's so?
Technical reasons? Reasons in the language designers' mentalities?

Regards,
Jo

Florian Weimer

Sep 26, 2005, 4:43:39 PM
* Joachim Durchholz:

>> The syntax is generally more simplistic than statically typed
>> languages.
>
> Agreed with that one - anybody got a clue why that's so?

Type declarations are quite complex ("access not null function return
access not null Integer") and a source of all kinds of oddities.

In addition, there seems to be a tendency to avoid ambiguities through
special syntax, for example "." and "->" in C, or, in C++, "<" and ">"
as template argument list delimiters and the "typename" keyword.

Joachim Durchholz

Sep 26, 2005, 5:46:40 PM
Florian Weimer schrieb:

Um, C++ isn't a particularly good example. It goes without saying that
you need additional syntax to express types; I'm more interested in why
statically typed languages tend to have a more complex syntax even in
areas where types aren't written down.
Think of an FPL with type inference: you don't write down many types in
these, yet their syntax tends to be somewhat more complex than their
dynamically-typed equivalents. I.e. I'm comparing Haskell with
Lisp/Smalltalk and wondering why Haskell still needs somewhat more
syntactic sugar.

Of course, on the complex side of the spectrum, you find languages of
both kinds (e.g. Perl for dynamic typing, C++ for static typing). In
fact I'm not totally sure that the correlation of "dynamic typing means
less syntax" holds in general (and establishing a correlation of any
kind would be a dubious undertaking anyway).
But I'm still interested in "how lean can we get". (With the obvious
exception that a statically typed language needs a syntax for type
expressions.)

Regards,
Jo

Ulf Wiger

Sep 27, 2005, 6:18:32 AM
Ben Rudiak-Gould <br276d...@cam.ac.uk> writes:

> Marshall Spight wrote:
> > I am curious as to what people see as the primary benefits
> > of dynamic typing.

In part, the same as with type inference.

Add to that the rather widespread notion that it is
often easier to write a program in a dynamically typed
language that is _good enough_ (if not provably correct)
even though it wouldn't necessarily pass a type check.

Not all people agree that this is true, but I think this
is similar to the situation that some proponents of static
typing prefer to declare types up-front, while others
prefer the compiler to infer types automatically.



> I think that the major advantage of dynamic typing is that
> it's much easier to design a good dynamically typed language
> than a good statically typed language. When your language is
> statically typed, and especially when it's Hindley-Milner,
> new language features often interact in tricky ways with the type
> system, and therefore with each other.

I think there might be some truth to this.

In industry, most programmers are infidels, at least as
regards types. If typing helps, fine. If it gets in the
way, then it's out of there. If typing would save you
hundreds of hours down the road for the price of a few
hours up front, well, that's too bad. Industrial projects
don't necessarily work like that. Extra work up-front in
order to possibly gain advantage later doesn't always win,
partly because if you don't win the battle for market share,
there may not be a 'later'.

> It's a difficult theoretical problem to figure out how to add
> objects, runtime reflection, metaprogramming, dynamic linking,
> on-line code replacement, etc. to a statically typed language.
> In a dynamically typed language, you can usually just add things
> in the most straightforward way, and it'll work as long as the
> programmer follows some straightforward rules. And you don't
> have to worry about your extension precluding some other
> useful extension in the future.

Another, perhaps a bit provocative, view on this is that since
it's a more difficult theoretical problem, it is more academically
interesting. Thus, if you're going for a PhD in programming
language theory, obviously statically typed languages offer
more fertile ground for research. Many of the challenges in
commercial product development are, at least academically,
non-problems, in that there are known techniques to deal with
them. The only problem for industry is that these techniques
seem (note the wording) out of reach for practical purposes.


> I don't think there is a type theory of dynamically typed
> languages. The term "dynamically typed" is really a misnomer; these
> languages don't have types at all, just values.

I don't think this describes a dynamically typed language like
Erlang, for example. After all, what's the difference between
the two patterns [1,2,3] and {1,2,3}? The first is a list with
the values 1,2 and 3, and the second is a tuple with the values
1,2 and 3. Isn't it reasonable to say that the main difference
is that of type, not values? The Erlang compiler is in many
cases good enough to detect type discrepancies and refuse
to compile, but this would be in violation of the language
specification, since:

{1,2,3} = [1,2,3]

is perfectly valid in Erlang and _will_ lead to a runtime
exception if evaluated. It's not that this can't be detected
at compile-time; it's just "illegal" to make it a compile
error. In this explicit case, I can't come up with an argument
for why the compiler shouldn't reject the program, except
perhaps consistency. Having said this, the Erlang compiler
_will_ reject a program that uses record notation and refers
to an attribute not present in the record declaration.

So basically there are weakly typed, or untyped dynamically
typed languages, and there are strongly typed dynamically
typed languages. Erlang is strongly typed, and IMO slowly
moving towards being optionally statically typed.

Using Dialyzer, you can get quite a lot of type warnings
at compile-time (but only warnings, see above). For example,
Dialyzer will tell you the expression above will crash, and
since it does global analysis with type propagation, it
will of course find much more than that.

At the ICFP Erlang workshop last weekend, there was a
discussion about adding type annotations to Erlang.
This has been suggested before on the Erlang mailing
list, and most people welcome it. One approach might be
that if you have added type annotations, it will be
a compilation error if the type annotation and the
actual code are in conflict. The TypEr tool, also
presented at the workshop, will automatically annotate
the code for you, if you want.

Does that make Erlang statically or dynamically typed?

/Uffe
--
Ulf Wiger, Senior Specialist,
/ / / Architecture & Design of Carrier-Class Software
/ / / Team Leader, Software Characteristics
/ / / Ericsson AB, IMS Gateways

Bruce Stephens

Sep 28, 2005, 6:23:51 AM
Joachim Durchholz <j...@durchholz.org> writes:

[...]

> I.e. I'm comparing Haskell with Lisp/Smalltalk and wondering why
> Haskell still needs somewhat more syntactic sugar.

I doubt that there's a need for more syntax. I imagine one can
construct an AST for any Haskell program, and presumably that AST
could be represented in sexps.

That kind of thing suggested (to me, anyway) a possibly interesting
student-type project.

There's been some suggestion that it might be interesting to have a
statically typed lisp (such things exist, of course), in that that
would fit better in CLR, for example (with its other statically typed
languages like C#).

Anyway, a similar idea would be to develop a lisp syntax for (say)
ocaml. OCaml has mostly implicit typing, and its environment has a
fairly rich set of modules and things, and it's not pure, which seems
likely to fit lisp better than a pure semantics would.

(Most of the statically typed lisps have been attempts at statically
typed variants of existing lisps, I think. But maybe that's not the
best way to approach things: maybe it would be better to go at a
target which has lots of good things already, and see if the lisp
approach to syntax brings something extra.)

[...]

Panu Kalliokoski

Sep 28, 2005, 6:43:19 AM
In <87hdc5s...@cenderis.demon.co.uk> Bruce Stephens <bruce+...@cenderis.demon.co.uk> writes:
>There's been some suggestion that it might be interesting to have a
>statically typed lisp (such things exist, of course), in that that
>would fit better in CLR, for example (with its other statically typed
>languages like C#).

One of probably the earliest approaches to this was the design of
pre-scheme, which was used to write an interpreter for Scheme. The idea
was (IIUC) to write the interpreter of Scheme in an easy-to-compile
language that is a strict subset of Scheme. This way, the interpreter
can run itself, while compiled code remains reasonably easy to produce.

Panu
--
personal contact: ate...@iki.fi, +35841 5323835, +3589 85619369
work contact: panu.kal...@helsinki.fi, +35850 3678003
kotisivu (henkkoht): http://www.iki.fi/atehwa/
homepage (technical): http://sange.fi/~atehwa/

Jon Harrop

Sep 28, 2005, 12:40:23 PM
Joachim Durchholz wrote:
> Aaron Gray schrieb:

>> The syntax is generally more simplistic than statically typed
>> languages.
>
> Agreed with that one - anybody got a clue why that's so?

I don't think it is true. Many dynamically typed languages (e.g. Python,
Perl, Tcl, Ruby, PHP, BASIC, Kogut) have syntaxes that are at least as
complex as simple MLs.

Lisp and Mathematica are dynamically typed. Both provide "code as data",
which leads to a simple underlying syntax (an "s-expr" in Lisp or a
"FullForm expr" in Mathematica). However, even here, Mathematica goes to
great lengths to parse, pretty print and typeset its exprs so most users
are never exposed to this internal representation.

So I think the assertion is only true if it is changed to "Lisp's syntax is
generally more simplistic than statically typed languages". Then the answer
is "Lisp was designed to have a simple syntax".

> Technical reasons? Reasons in the language designers' mentalities?

In Lisp's case, I think it is nothing more than religion. Mathematica has
shown that the same functionality can be provided without having to force
users to read and write raw ASTs.

Joe Hendrix

Sep 28, 2005, 3:09:02 PM
Joachim Durchholz wrote:
>> The syntax is generally more simplistic than statically typed
>> languages.
>
> Agreed with that one - anybody got a clue why that's so?
> Technical reasons? Reasons in the language designers' mentalities?

The fundamental reason is that a statically typed language needs
syntax for describing types. Type inference can alleviate some of
this, but not all.

The fundamentals however are obscured by individual preferences with
regard to syntax. For example, one could undoubtedly come up with
some type of many-sorted version of Lisp that would have types, but
the syntax would be much simpler than almost every dynamically typed
language. I don't necessarily think that is a good idea, but it is
possible.

Bruce Stephens

Sep 28, 2005, 5:06:43 PM
Panu Kalliokoski <ate...@sange.fi> writes:

> In <87hdc5s...@cenderis.demon.co.uk> Bruce Stephens <bruce+...@cenderis.demon.co.uk> writes:
>>There's been some suggestion that it might be interesting to have a
>>statically typed lisp (such things exist, of course), in that that
>>would fit better in CLR, for example (with its other statically typed
>>languages like C#).
>
> One of probably the earliest approaches to this was the design of
> pre-scheme, which was used to write an interpreter for Scheme. The
> idea was (IIUC) to write the interpreter of Scheme in an
> easy-to-compile language that is a strict subset of Scheme. This
> way, the interpreter can run itself even if the compiled code is
> reasonably easy to produce.

Yeah, I guess. I was more thinking of languages intended for general
purpose use (rather than for bootstrapping the real language). I'm
fairly sure I've seen statically typed lisp variants aimed at C/C++
and JVM recently, for example, although I can't seem to find them now.
(Not that they're recent; it's just that I'd never seen them before.)

Not that that matters; the problem I felt with them is that they were
trying to fit cleanly into an impoverished environment (so in order to
fit easily, they chose to give up closures, for example).

It seemed to me that if you chose something like OCaml, which (I
think, anyway) has closures and neat stuff like that, then you ought
to end up with something that's nice and high-level, and which is
probably as close to a lisp as is possible, but which has static
typing.

Maybe that turns out not to be interesting, though. Maybe the macros
that lisp's syntax makes convenient aren't so useful in a strongly
typed language (or maybe they're not useful in some strongly typed
languages that people have tried them in).

And maybe the syntactic simplicity doesn't actually help in learning
or remembering, because the things that the syntax expresses have to be
there somewhere, and it's just somewhere else.

George Neuner

Sep 29, 2005, 5:44:38 AM
On Wed, 28 Sep 2005 17:40:23 +0100, Jon Harrop <use...@jdh30.plus.com>
wrote:

>Joachim Durchholz wrote:
>> Aaron Gray schrieb:
>>> The syntax is generally more simplistic than statically typed
>>> languages.
>>

>In Lisp's case, I think it is nothing more than religion.

More an accident of history.

From the beginning, McCarthy intended Lisp to have an infix syntax (I
think he called it M-exprs) and to use S-exprs only as an internal
representation. The infix syntax never materialized because people
realized that writing code directly in the AST form had advantages -
such as you could easily write code that writes code - important in AI
which was Lisp's original target area.

Remember also that Lisp was designed at a time when a computer with
32K words of core memory was BIG. Lisp was interactive and the extra
cost of having an integrated infix reader on top of the S-expr reader
(which was still needed) was judged too much. Eliminating it made
room for other more meaningful features.

After a while everyone became used to writing in the AST form. By the
time Lisp was ported to more capable machines, most everyone had
forgotten about the original plan for infix syntax. Anyone who really
wanted a different programming syntax could implement it using macros.

George
--
for email reply remove "/" from address

David Hopwood

Sep 29, 2005, 11:24:06 AM
George Neuner wrote:
> Jon Harrop <use...@jdh30.plus.com> wrote:
>>Joachim Durchholz wrote:
>>>Aaron Gray schrieb:
>>>
>>>>The syntax is generally more simplistic than statically typed
>>>>languages.
>
>>In Lisp's case, I think it is nothing more than religion.
>
> More an accident of history.

It is an accident of history that has become a religion (maybe that's
true of religions in general ;-)

For true believers of this religion, the point that there are ways
to do metaprogramming just as easily without having the source code
correspond as literally to an AST as s-expressions do, seems to fall
on deaf ears.

> Anyone who really wanted a different programming syntax could implement
> it using macros.

This is another of the religion's peculiar dogmas. The fact is that
people who are turned off by s-expression syntax don't spend their time
writing macro packages or parsers to translate other syntaxes into Lisp;
they just don't use Lisp. This is probably their loss, but it's still
true.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Marcin 'Qrczak' Kowalczyk

Sep 29, 2005, 3:16:15 PM
"James" <james...@gmail.com> writes:

>>It's a difficult theoretical problem to figure out how to add
>>objects, runtime reflection, metaprogramming, dynamic linking,
>>on-line code replacement, etc. to a statically typed language.
>
> I think you've generally hit upon the answer, but I'd rephrase your
> "it's much easier to design a good dynamically typed language" as
> "dynamic typing is the simpler solution."

I agree, this is what I would answer too. Dynamic typing allows
a simpler language with the same expressivity.

For example ML has structures (modules) and functors as a separate
concept from records and functions only because of static typing.
They need to trade some properties for others to keep typing decidable.
They may contain public components of a type which is different in
different structures matching the same signature, but they are not
first-class.
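
A minimal sketch of that separation (my own toy example): a functor is a
compile-time function from modules to modules, kept apart from ordinary
functions over records because its argument carries a type component:

module type ORD = sig
  type t
  val compare : t -> t -> int
end

(* MakeSet is applied and resolved at compile time; the resulting
   modules are not first-class values. *)
module MakeSet (O : ORD) = struct
  type elt = O.t
  type t = elt list
  let empty : t = []
  let add x s =
    if List.exists (fun y -> O.compare x y = 0) s then s else x :: s
end

module IntSet = MakeSet (struct type t = int let compare = compare end)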

* * *

There are two major families of static type systems:

1. Typical for functional languages (Haskell, Clean, SML, OCaml,
Mercury): type inference for global and local variables (but not
for object fields), no subtyping (or only limited subtyping with
explicit subsumption as in OCaml), easy parametric polymorphism,
no runtime type information.

2. Typical for OO languages (Java, C#, Eiffel): subtyping (with
subtype relationships explicitly declared at the point of the
definition of the subtype), explicitly specified type for most
variables, runtime type information, downcasting (needs RTTI),
parametric polymorphism added late.

(There is also a third family of poorly expressive type systems,
as in Pascal, C and C++, which I'm not considering here.)

The "functional" family emphasizes static guarantees provided by the
type system, and it grows features which allow one to accurately describe
more advanced types. I'm seeing that in Haskell and OCaml.

The "OO" family resorts to dynamic typing (RTTI and downcasting) when
it's not able to express some types statically (e.g. for dispatching
exceptions, and for the second parameter of binary methods).

While proponents of functional programming claim that type systems
which allow one to describe more scenarios without resorting to runtime
checking are superior, the "OO" family has an advantage in being able
to emulate dynamic typing when needed.

Consider the issue of exceptions. It's no problem for dynamically
typed languages nor for OO languages to check the type of a given
exception object at runtime (I'm ignoring the issue of checked
exceptions in Java). SML and OCaml have a separate language feature
for introducing new kinds of exceptions, which is mostly adequate,
although it doesn't allow to conveniently group exception types
in a hierarchy. But Haskell doesn't provide a good way of having an
extensible exception type. It has a regular algebraic type for builtin
I/O errors, and Glasgow Haskell allows to use a kludge of the Dynamic
type for exceptions (again without a hierarchy).
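
In OCaml the feature looks like this (a trivial sketch): a new case can
be added to the open exn type anywhere in a program, but no hierarchy
among the cases can be declared:

exception Parse_error of string

let () =
  try raise (Parse_error "unexpected token")
  with Parse_error msg -> prerr_endline ("parse error: " ^ msg)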

The issue which pushed me towards dynamic typing was making an
extensible abstract syntax tree. Again it's easy with dynamic typing
and with OO-style static typing. I had to do that in Haskell, and
since the AST types to be extended were a regular algebraic type,
I had to copy & paste them for extending. It seems there is no good
mechanism in Haskell to make such type extensible. Haskell and OCaml
have the most impressive type systems I've seen, and it's a pity that
they both failed this case.

I already knew about limitations of OO-style static typing, e.g. that
it doesn't provide an equivalent of Haskell / Clean / Mercury classes
(which, unlike OO-style interfaces, can have instances defined outside
the definition of the corresponding type). So dynamic typing was the
natural choice.

I appreciate and miss advantages of static typing, especially pointing
out many programmer errors at compile time. But I couldn't imagine a
sufficiently powerful static type system.

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

Joachim Durchholz

Sep 29, 2005, 4:52:36 PM
Marcin 'Qrczak' Kowalczyk schrieb:

> While proponents of functional programming claim that type systems
> which allow one to describe more scenarios without resorting to runtime
> checking are superior, the "OO" family has an advantage in being able
> to emulate dynamic typing when needed.

Agreed.

> Consider the issue of exceptions. It's no problem for dynamically
> typed languages nor for OO languages to check the type of a given
> exception object at runtime (I'm ignoring the issue of checked
> exceptions in Java).

Hmm... from the an-exception-is-abortion perspective, you don't need to
map different kinds of errors to different exception types. If all that
an exception does is to abort a computation, the program doesn't care
much about data that the failed computation may hold inside. (The
/programmer/ would care, but the exception objects can hold that data in
string form, so there's no need of a multitude of types to get that to
work.)

At least that's my experience with exceptions-are-abortions regimes.
I found them to work quite well indeed, and usually better than
exceptions-are-shortcircuited-computations.
YMMV :-)

> SML and OCaml have a separate language feature for introducing new
> kinds of exceptions, which is mostly adequate, although it doesn't
> allow one to conveniently group exception types in a hierarchy.

From the perspective shown above, this is a non-issue. If an exception
is a failed computation, the code that catches the exception isn't very
interested in the kind of failure. It needs a lot of information in
string form so that it can log a meaningful error message, but the exact
cause of the failure isn't much of the caller's concern.
In fact, the only strategies that it has are:
* Retry (if real-world objects with temporary glitches are involved)
* Try a different processing strategy, i.e.
* try a different algorithm/configuration/whatever
* degrade gracefully (if its own contract allows that!)
* Pass the buck up by aborting with an exception


IOW I think the example of exception types isn't a particularly well-chosen one.
I can't tell about the other examples you're bringing up :-)

Regards,
Jo

Vincenzo Ciancia

Sep 29, 2005, 11:09:11 PM
Marcin 'Qrczak' Kowalczyk wrote:

> The issue which pushed me towards dynamic typing was making an
> extensible abstract syntax tree.

> [cut]

> Haskell and OCaml
> have the most impressive type systems I've seen, and it's a pity that
> they both failed this case.

Maybe I am not following you, but I think that polymorphic variants should
be the right thing for an extensible abstract syntax tree.

V.

--
Please note that I do not read the e-mail address used in the from field but
I read vincenzo_ml at yahoo dot it
Attenzione: non leggo l'indirizzo di posta usato nel campo from, ma leggo
vincenzo_ml at yahoo dot it

Jon Harrop

Sep 29, 2005, 5:42:48 PM
David Hopwood wrote:
> This is another of the religion's peculiar dogmas. The fact is that
> people who are turned off by s-expression syntax don't spend their time
> writing macro packages or parsers to translate other syntaxes into Lisp;
> they just don't use Lisp. This is probably their loss, but it's still
> true.

Apart from the "their loss", I agree. ;-)

Paul Rubin

Sep 29, 2005, 5:55:03 PM
David Hopwood <david.nosp...@blueyonder.co.uk> writes:
> This is another of the religion's peculiar dogmas. The fact is that
> people who are turned off by s-expression syntax don't spend their time
> writing macro packages or parsers to translate other syntaxes into Lisp;
> they just don't use Lisp. This is probably their loss, but it's still true.

Dylan?

Jon Harrop

Sep 29, 2005, 5:57:14 PM
Marcin 'Qrczak' Kowalczyk wrote:
> Dynamic typing allows a simpler language with the same expressivity.

I know this isn't what you mean but you cannot express static checks in a
dynamically typed language. I find that to be very useful in practice.

> SML and OCaml have a separate language feature
> for introducing new kinds of exceptions, which is mostly adequate,
> although it doesn't allow one to conveniently group exception types
> in a hierarchy.

Modules act as hierarchical namespaces for exceptions, at least.

> The issue which pushed me towards dynamic typing was making an
> extensible abstract syntax tree. Again it's easy with dynamic typing
> and with OO-style static typing. I had to do that in Haskell, and
> since the AST types to be extended were a regular algebraic type,
> I had to copy & paste them for extending. It seems there is no good
> mechanism in Haskell to make such a type extensible. Haskell and OCaml
> have the most impressive type systems I've seen, and it's a pity that
> they both failed this case.

Have you read Jacques Garrigue's paper on code reuse through polymorphic
variants? He demonstrates an extensible AST in OCaml.

> I appreciate and miss advantages of static typing, especially pointing
> out many programmer errors at compile time. But I couldn't imagine a
> sufficiently powerful static type system.

It seems odd to ditch static typing when dynamic typing is just a special
case. Did you consider providing an alternate syntax for dynamic typing
within a statically typed language?

Joe Hendrix

Sep 29, 2005, 6:34:44 PM
Marcin 'Qrczak' Kowalczyk wrote:
> There are two major families of static type systems:
>
> 1. Typical for functional languages (Haskell, Clean, SML, OCaml,
> Mercury): type inference for global and local variables (but not
> for object fields), no subtyping (or only limited subtyping with
> explicit subsumption as in OCaml), easy parametric polymorphism,
> no runtime type information.
>
> 2. Typical for OO languages (Java, C#, Eiffel): subtyping (with
> subtype relationships explicitly declared at the point of the
> definition of the subtype), explicitly specified type for most
> variables, runtime type information, downcasting (needs RTTI),
> parametric polymorphism added late.

In terms of common "major" type systems this is correct. Subtypes are
used extensively in "functional" programming languages based on term
rewriting such as OBJ and Maude. Maude also has metaprogramming, though
it is type checked at runtime.

There are also higher order type systems that show up in proof
assistants like Coq. I don't know of any major languages that use them.

> (There is also a third family of poorly expressive type systems,
> as in Pascal, C and C++, which I'm not considering here.)

Lumping C++ in with Pascal and C seems quite wrong. I suspect that you
have not used the C++ STL extensively, much less done any template
metaprogramming. C++ has a significantly more expressive type system than
Java or C#. C++ templates can avoid the problems with multimethods that
OO type systems face, and are quite expressive. Through metaprogramming
it is possible to make type lists and functions that operate on types.
The consequence is that the type system is not decidable, but I haven't
found that to be much of a problem in practice so far. The horrible
syntax one needs to use seems to be a bigger issue.

I don't personally have significant enough experience to compare the
type system in C++ with those of Haskell or ML.

Neelakantan Krishnaswami

Sep 29, 2005, 7:29:38 PM
In article <871x37u...@qrnik.zagroda>, Marcin 'Qrczak' Kowalczyk wrote:
>
> The issue which pushed me towards dynamic typing was making an
> extensible abstract syntax tree. Again it's easy with dynamic typing
> and with OO-style static typing. I had to do that in Haskell, and
> since the AST types to be extended were a regular algebraic type,
>
> I had to copy & paste them for extending. It seems there is no good
> mechanism in Haskell to make such a type extensible. Haskell and OCaml
> have the most impressive type systems I've seen, and it's a pity that
> they both failed this case.

To do this, you need to leave the recursive knot untied. For example,
if you want to do unification, you may want an extra case for
unification variables.

First, define a functor to tie the knot explicitly.

module Fix(F : sig type 'a t end) =
struct
  type t = [ `Fix of t F.t ]
end

Then you can do something like:

module Expr =
struct
  type 'a t = [ `Node of string * 'a list ]
end

module Extend =
struct
  type 'a t = [ 'a Expr.t
              | `Evar of string ]
end

module Base = Fix(Expr)
module Unif = Fix(Extend)

Now every term of type Base.t will also be an element of Unif.t.
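
A usage sketch (the terms are my own; I believe the coercions go through):

let one : Base.t = `Fix (`Node ("1", []))
let sum : Base.t =
  `Fix (`Node ("+", [ `Fix (`Node ("1", [])); `Fix (`Node ("2", [])) ]))

(* Upcast a closed term into the extended language, where terms may
   also mention unification variables directly. *)
let sum' : Unif.t = (sum :> Unif.t)
let goal : Unif.t = `Fix (`Node ("+", [ (one :> Unif.t); `Fix (`Evar "x") ]))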

--
Neel Krishnaswami
ne...@cs.cmu.edu

Marcin 'Qrczak' Kowalczyk

Sep 29, 2005, 8:00:26 PM
Joachim Durchholz <j...@durchholz.org> writes:

> From the perspective shown above, this is a non-issue. If an
> exception is a failed computation, the code that catches the
> exception isn't very interested in the kind of failure. It needs a
> lot of information in string form so that it can log a meaningful
> error message, but the exact cause of the failure isn't much of the
> caller's concern.

Most languages support distinguishing the kind of failures, which is
used for other things besides printing it to the user and exiting,
and I think it's useful enough to be worth supporting.

I/O errors are usually handled differently than program bugs, but the
mechanism of exceptions is suitable for both.

There is an analogous issue of asynchronous signals (Unix signals,
memory overflow, internal timeouts, asynchronous communication between
threads). Their kinds must be distinguished if they are supported, and the
set of kinds of signals is open. They also participate in the set of
exceptions because many of them are turned into exceptions by default.

Exceptions are suitable for expressing non-local exits.

Python used to have string-only exceptions. It was later changed to
support arbitrary objects, and their dynamic type is used to distinguish
the reasons.


Jon Harrop <use...@jdh30.plus.com> writes:

>> SML and OCaml have a separate language feature for introducing new
>> kinds of exceptions, which is mostly adequate, although it doesn't
>> allow one to conveniently group exception types in a hierarchy.
>

> Modules act as hierarchical namespaces for exceptions, at least.

I meant grouping for the purpose of catching any exception from the
group.

> Have you read Jacques Garrigue's paper on code reuse through
> polymorphic variants? He demonstrates an extensible AST in OCaml.

Ok, maybe it would work. I more or less know how they work in theory,
although I haven't used them in practice.

> It seems odd to ditch static typing when dynamic typing is just a
> special case.

I don't understand. There are two families of dynamic typing:
1. the set of types is closed (Scheme R5RS, Erlang, Prolog, Logo)
2. the set of types is open (most dynamically typed languages)
and the two families of statically typed languages I mentioned are
distinguished by an analogous property: whether the set of "cases"
or "subtypes" of a type is closed or open.

Emulation becomes hard when a statically typed language with closed
sets of cases (the "functional" family) wants to embed a dynamically
typed language with an open set of types. It's easy only in the case of types
defined in the language itself, and for a fixed number of primitive
types. It breaks when extending the embedded language with types
defined in various libraries of the host language.

I did it once in OCaml by using a custom RTTI based on Obj.magic. This
is cheating because it's unsafe. There might be other, safer but still
ugly workarounds:
- using the exn type (sketched below)
- registering pairs of functions which update a global reference
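
The exn workaround would look something like this (a sketch; the
constructor names are mine):

type univ = exn   (* exn is the only extensible sum type, so abuse it *)

exception Int_val of int
exception String_val of string

let of_int (i : int) : univ = Int_val i
let to_int (u : univ) : int option =
  match u with
  | Int_val i -> Some i
  | _ -> None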

> Did you consider providing an alternate syntax for dynamic typing
> within a statically typed language?

It can't be fixed on the syntactic level if the semantics doesn't
include the possibility of grouping an open set of types in a given
supertype.


Joe Hendrix <jhen...@uiuc.edu> writes:

> Lumping C++ in with Pascal and C seems quite wrong.

Ok, but it's a different dimension which is orthogonal to embedding
dynamic typing in a formally statically typed program.

By the way, I would say that C++ templates use a compile-time dynamic
type system, and perhaps this makes them powerful. Yes, it's all
ultimately checked before the program is run, but interfaces of
templates themselves aren't described by the language. Templates
generally fail with horrible error messages coming from their
internals if they are applied to a type with wrong properties,
just like when a function in a dynamically typed language is misused.
This is dynamic typing lifted to the metalevel.

OTOH Haskell, C# and Java have features with applications which
overlap with C++ templates, but they are fully statically typed,
i.e. interfaces of abstractions they introduce are described more
abstractly than by requirements deduced from their implementation.

It seems that there is a larger set of problems that templates are
suitable for but these features aren't, than the converse.

David Hopwood

Sep 29, 2005, 8:14:59 PM

Dylan is the exception that proves the rule.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Jon Harrop

Sep 29, 2005, 10:01:22 PM
Marcin 'Qrczak' Kowalczyk wrote:
> Jon Harrop <use...@jdh30.plus.com> writes:
>>> SML and OCaml have a separate language feature for introducing new
>>> kinds of exceptions, which is mostly adequate, although it doesn't
>>> allow one to conveniently group exception types in a hierarchy.
>>
>> Modules act as hierarchical namespaces for exceptions, at least.
>
> I meant grouping for the purpose of catching any exception from the
> group.

Ah yes. Would it not be easy to implement catching any exception defined
within a module as an extension to ML?

>> Have you read Jacques Garrigue's paper on code reuse through
>> polymorphic variants? He demonstrates an extensible AST in OCaml.
>
> Ok, maybe it would work. I more or less know how they work in theory,
> although I haven't used them in practice.

Yes, I've played with them because I'm interested in extensible scene trees
for graphics. Basically the same problem.

>> It seems odd to ditch static typing when dynamic typing is just a
>> special case.
>
> I don't understand. There are two families of dynamic typing:
> 1. the set of types is closed (Scheme R5RS, Erlang, Prolog, Logo)
> 2. the set of types is open (most dynamically typed languages)
> and the two families of statically typed languages I mentioned are
> distinguished by an analogous property: whether the set of "cases"
> or "subtypes" of a type is closed or open.

Are polymorphic variants closed or open?

> Emulation becomes hard when a statically typed language with closed
> sets of cases (the "functional" family) wants to embed a dynamically
> typed language with open set of types. It's easy only in case of types
> defined in the language itself, and for a fixed number of primitive
> types. It breaks when extending the embedded language with types
> defined in various libraries of the host language.

Are you saying that the representation of values and types in a Lisp
interpreter written in ML (for example) is "hard"?

> I did it once in OCaml by using a custom RTTI based on Obj.magic. This
> is cheating because it's unsafe. There might be other, safer but still
> ugly workarounds:
> - using the exn type
> - registering pairs of functions which update a global reference

Yes, that's pretty horrific. :-)

>> Did you consider providing an alternate syntax for dynamic typing
>> within a statically typed language?
>
> It can't be fixed on the syntactic level if the semantics doesn't
> include the possibility of grouping an open set of types in a given
> supertype.

I think it can be fixed on the syntactic level as long as you're willing to
go most of the way to writing the dynamic typing bits of an interpreter.
That's a lot of extra code, it's ugly and the dynamic type checks are slow,
but how much can be factored out into a neat little syntax?

Jon Harrop

Sep 29, 2005, 10:07:22 PM
Joe Hendrix wrote:
> There are also higher order type systems that show up in proof
> assistents like Coq. I don't know of any major languages that use them.

Any ideas why they aren't used in ordinary languages? I have found that I
need to be quite familiar with the type system in order to make good use of
ML. Perhaps their type systems are just too complicated for Joe Programmer.

> I don't personally have significant enough experience to compare the
> type system in C++ with those of Haskell or ML.

IMHO, practical issues make C++'s type system unusable, as you said. In
addition to the hideous syntax, there is the appallingly slow compile
time. When I last used C++ I was seeing compile times of 1-24hrs for
<10kLOC projects.

Panu Kalliokoski

Sep 30, 2005, 12:26:31 AM
The AST and exception examples bring up very good points about the
differences of static and dynamic typing, but I'm still wondering about
this:

In <871x37u...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:
>1. Typical for functional languages (Haskell, Clean, SML, OCaml,
> Mercury): type inference for global and local variables (but not
> for object fields), no subtyping (or only limited subtyping with
> explicit subsumption as in OCaml), easy parametric polymorphism,
> no runtime type information.

I fail to see how OCaml's subtyping is "limited", and I've always deemed
OCaml the language where type subsumption (of classes) is _implicit_,
based on the subsumption relation of their type signatures.

OCaml has these kinds of "must-support-at-least" and
"does-support-at-most" type signatures in at least polymorphic variants
and objects. Moreover, the exn type is extensible. (I think OCaml
would still benefit from a "dynamic" type.)

What OCaml lacks is RTTI. But as you admit, languages with RTTI are not
really statically typed at all, so they hardly qualify as a category of
statically typed languages. OCaml's FAQs urge you to use exceptions if
you need RTTI.

Joachim Durchholz

Sep 30, 2005, 4:10:49 AM
Marcin 'Qrczak' Kowalczyk schrieb:

> Joachim Durchholz <j...@durchholz.org> writes:
>
>
>>From the perspective shown above, this is a non-issue. If an
>>exception is a failed computation, the code that catches the
>>exception isn't very interested in the kind of failure. It needs a
>>lot of information in string form so that it can log a meaningful
>>error message, but the exact cause of the failure isn't much of the
>>caller's concern.
>
> Most languages support distinguishing the kind of failures, which is
> used for other things besides printing it to the user and exiting,
> and I think it's useful enough to be worth supporting.

I disagree about that.

> I/O errors are usually handled differently than program bugs, but the
> mechanism of exceptions is suitable for both.

It's suitable, but is it necessary?

Regarding I/O errors, the system may assume that I/O errors "just don't
happen" (as is reasonable with today's harddisks), so the functions'
interfaces might not provide for failures. I.e. if you read the first 20
bytes of a file, the caller is supposed to assume that those 20 bytes
are accessible. If the low-level function cannot read the data, this is
a failure (actually not a bug, but handled just like one): it cannot
fulfill its contract and must throw an exception. If the caller checks
for the exception, it can react to read errors (e.g. by retrying); if it
doesn't, the exception will simply propagate upwards.
Alternatively, the software doesn't assume errors away. The contract of
a Read function now says "either here are your bytes, or we have an
error and here is the error data".

> There is an analogous issue of asynchronous signals (Unix signals,
> memory overflow, internal timeouts, asynchronous communucation between
> threads). Their kinds must be distinguished if they are supported, the
> set of kinds of signals is open. They also participate in the set of
> exceptions because many of them are turned into exceptions by default.

I wouldn't use Unix signals as an argument here, though.

> Exceptions are suitable for expressing non-local exits.

Yes, but non-local exits complicate issues. They closely couple
low-level and high-level activities.

One example: I'm dividing foo/bar and get a DivideByZero exception. The
knee-jerk reaction might be to assume that bar is zero and react
accordingly, but if foo is a function, the exception might have
originated there.

Of course, I could run each function separately and assign its results
to intermediates. But that deprives me of the possibility to rearrange
code at will.
Even if I were willing to take that trade-off, I'd be unable to properly
handle that DivideByZero exception. It might be a problem in foo with
known (or inferrable/diagnosable) causes, or it might be a problem
somewhere deep down the call stack. We could do some hefty analysis of
the call stack info that should be part of the exception, but then the
top-level function has to "know" a lot about the infrastructure that
it's calling, creating the above-mentioned tight coupling between high
and low software levels.

Isn't FPL programming about making dependencies explicit? Having the
low-level routines return sum types that either contain the desired
result or error information is just the right vehicle for this kind of
explicitness.
It even gives the caller the choice whether to handle an unusual status
as a bug or as something that needs to be handled. Handling as a bug
means: matching just for the normal variant; the language will throw an
exception. Handling it explicitly means matching for the normal case as
well as any other cases. We might even go a middle ground: for a file
I/O error, the code might choose to match on "medium removed" (this
could be a common occurrence with NFS files or removable media such as
CDs or USB sticks), but ignore "read error" or "end-of-file" (the first
being an external problem, the second probably a bug - but we don't
care, it's a failure outside the contract of the function, and all we
want is a useful error message).
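
A minimal sketch of what I mean, reusing the error cases from the file
I/O example above (the names are mine):

type read_error = Medium_removed | Read_error | End_of_file

type read_result = Bytes of string | Failed of read_error

(* Match only on the cases we choose to handle; everything else is
   escalated as a failure. *)
let rec read_fully (read : unit -> read_result) : string =
  match read () with
  | Bytes b -> b
  | Failed Medium_removed -> read_fully read   (* external glitch: retry *)
  | Failed (Read_error | End_of_file) ->
      failwith "read failed"                   (* outside the contract *)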

> Python used to have string-only exceptions. It was later changed to
> support arbitrary objects, and their dynamic type is used to distinguish
> the reasons.

As I said: using dynamic types is just convenient in OO languages. You
could equally well use nested sum types.

Regards,
Jo

Marcin 'Qrczak' Kowalczyk

Sep 30, 2005, 4:29:15 AM
Jon Harrop <use...@jdh30.plus.com> writes:

>> I meant grouping for the purpose of catching any exception from the
>> group.
>
> Ah yes. Would it not be easy to implement catching any exception
> defined within a module as an extension to ML?

Even if it were, it would not be sufficient, because it would not allow
one to extend preexisting groups. For example you couldn't add more "logic
error" exceptions from outside the standard library.

>> I don't understand. There are two families of dynamic typing:
>> 1. the set of types is closed (Scheme R5RS, Erlang, Prolog, Logo)
>> 2. the set of types is open (most dynamically typed languages)
>> and the two families of statically typed languages I mentioned are
>> distinguished by an analogous property: whether the set of "cases"
>> or "subtypes" of a type is closed or open.
>
> Are polymorphic variants closed or open?

Closed. You can't emulate exn using polymorphic variants because
adding a new variant changes the type ("supertype").

Well, they are more open than regular algebraic types, because
extending them merges the new variants with old variants, and if
the code using the old variants was kept polymorphic, then it can
be reused. Perhaps it's worth distinguishing 3 levels of openness
in the statically typed case.

> Are you saying that the representation of values and types in a Lisp
> interpreter written in ML (for example) is "hard"?

Yes, when it includes the possibility to wrap new ML modules to make
their functionality available to the embedded Lisp without changing
the core of the interpreter (the representation of runtime values).

Marcin 'Qrczak' Kowalczyk

Sep 30, 2005, 4:43:37 AM
Panu Kalliokoski <ate...@sange.fi> writes:

> I fail to see how OCaml's subtyping is "limited", and I've always
> deemed OCaml the language where type subsumption (of classes) is
> _implicit_, based on the subsumption relation of their type
> signatures.

Ok, perhaps I should have used a different adjective instead of
"limited".

I insist on explicitness. The subtyping relation is implicit,
but subsumption (upcasting) is explicit.

> What OCaml lacks is RTTI. But as you admit, languages with RTTI are
> not really statically typed at all, so they hardly qualify as a
> category of statically typed languages.

That was my point: languages with RTTI merge statically typed and
dynamically typed paradigms, which is their advantage, while those
without are limited to pure static typing.

exn in OCaml is indeed extensible as far as adding new cases and
recognizing a given case is concerned. There are, however, related
operations that both dynamically typed languages and those statically
typed with RTTI can do, while OCaml can't for exn:

- Making an operation declared for an open family of cases, which
is then defined incrementally for each case, and dispatched when
applied.

Well, it could be done by making a list of functions which check
for the exn case they handle and pass the others on. It's quite
inefficient with this linear search but would work (see the sketch
below).

- Extracting the type to use it as a dictionary key (this could be
used to implement dispatch).
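
That list-of-functions workaround might look like this in OCaml (the
registry and the Zlib_error exception are invented for illustration):

(* each handler inspects an exn and either produces a result or declines *)
let handlers : (exn -> string option) list ref = ref []

let register h = handlers := h :: !handlers

let dispatch e =
  let rec go = function
    | [] -> raise e    (* no handler claimed this exception *)
    | h :: rest -> (match h e with Some r -> r | None -> go rest)
  in
  go !handlers

(* a separately distributed module can add its own case: *)
exception Zlib_error of string
let () = register (function Zlib_error m -> Some ("zlib: " ^ m) | _ -> None)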

Things are not that black & white.

Marcin 'Qrczak' Kowalczyk
Sep 30, 2005, 7:34:29 AM
Joachim Durchholz <j...@durchholz.org> writes:

>> Most languages support distinguishing the kind of failures, which is
>> used for other things besides printing it to the user and exiting,
>> and I think it's useful enough to be worth supporting.
>
> I disagree about that.

In my compiler (13k lines) there are three places that catch specific
exceptions:

- On the toplevel of the compiler, to report all compile errors found
(sorted by source location) even if one of errors was fatal and
caused the compiler to abort by throwing an exception.

- When checking whether the interface file would change, an absent or
malformed interface file is treated as one to be changed (catching
IO_ERROR and CODING_ERROR).

- For working around a bug when it's compiled using an old version of
itself which did not define hashing of rational numbers, the method
redefinition exception is caught there and ignored.

In my Scheme interpreter (2800 lines) specific exceptions are caught
on the toplevel. The toplevel distinguishes an exception translated
from the SIGINT handler (to return to the toplevel) and PROGRAM_EXIT
(if a Scheme function which exits is called, the interpreter exits).
Other exceptions are printed for the user and the REPL continues.

In my Tetris game (520 lines) there are some specific exceptions being
caught:

- Attempting to write a message about the terminal being too small
may fail with CURSES_ERROR if it's too small even for that message.
This error is ignored instead of causing the program to abort,
the user only sees the beginning of the message.

- When an attempt to hide the cursor fails, the error is ignored.

- An exception is used internally by the construct providing a
non-local exit (sketched below). The construct is used for exiting
the main loop when the user presses "q".
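
The non-local exit might look like this in OCaml (read_key and step are
stand-ins for the real game code):

exception Quit

let main_loop read_key step =
  try
    while true do
      if read_key () = 'q' then raise Quit;
      step ()
    done
  with Quit -> ()  (* unwinds from anywhere inside the loop body *)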

I claim that they are legitimate uses of exceptions. It would be
inconvenient and inefficient (or, in the case of the asynchronous
SIGINT, even impossible) to make error propagation explicit, and it
would be fragile if reasons for exceptions were encoded as strings.

> Isn't FPL programming about making dependencies explicit?

I don't religiously adhere to general principles without looking at
consequences at a specific case. Yes, usually it's better to make
dependencies more explicit, up to a point. Too much and it will do
more harm than good.

In particular many people claim that Java checked exceptions were
a mistake. The essence of exceptions is that they transparently
propagate through code which is executing between the point of
throwing and catching. Making the propagation explicit in types makes
it hard to write reusable code which propagates different kinds of
exceptions outward depending on what is executed inside. It could be
improved by making it possible to parametrize methods and classes by
sets of possible exceptions, e.g. expressing that this method throws
the union of the exception types thrown by parameters of these two
generic classes, but then the language would get more complex. I'm
happy with leaving this implicit, as in almost all languages besides
Java.

Panu Kalliokoski
Sep 30, 2005, 7:48:02 AM
In <87oe6bq...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:

>Panu Kalliokoski <ate...@sange.fi> writes:

>> deemed OCaml the language where type subsumption (of classes) is
>> _implicit_, based on the subsumption relation of their type
>> signatures.

[...]


>I insist on explicitness. The subtyping relation is implicit,
>but subsumption (upcasting) is explicit.

Okay, now I see what you're referring to. I took subsumption to be the
mathematical relation that says which type can be coerced to which, not
the actual action of coercion. The former is implicit in OCaml, whereas
the latter is explicit, as you say.

>> What OCaml lacks is RTTI. But as you admit, languages with RTTI are
>> not really statically typed at all, so they hardly qualify as a
>> category of statically typed languages.
>That was my point: languages with RTTI merge statically typed and
>dynamically typed paradigms, which is their advantage, while those
>without are limited to pure static typing.

Okay. But they are then IMO semantically equivalent to
Lisp-with-a-sufficiently-smart-compiler. That is, dynamically typed,
period.

>- Making an operation declared for an open family for cases, which
> is then defined incrementally for each case, and dispatched when
> applied.

Maybe I'm dizzy today but I'm not sure what you're talking about. Maybe
a concrete example would help? (Or a pointer to an earlier example, if
one was already given.)

>Things are not that black & white.

Well, I haven't yet understood how "static typing + RTTI" is not really
dynamic typing, totally.

Marcin 'Qrczak' Kowalczyk
Sep 30, 2005, 8:39:34 AM
Panu Kalliokoski <ate...@sange.fi> writes:

> But they are then IMO semantically equivalent to
> Lisp-with-a-sufficiently-smart-compiler. That is, dynamically typed,
> period.

I won't fight over the terminology. Definitely it's closer to dynamic
typing than the functional type system is; in particular it can easily
emulate dynamic typing (modulo performance problems caused by the
inability to have unboxed integers and by lots of downcasts).

At the same time it's more static than typical dynamically typed
languages: the compiler will catch lots of type errors at compile
time in areas where the programming style used doesn't rely on that
"dynamic" typing, while in a true dynamically typed language a wrong
type is very rarely caught at compile time.

(I never liked these languages for other reasons, and jumped straight
to dynamic typing.)

>>- Making an operation declared for an open family for cases, which
>> is then defined incrementally for each case, and dispatched when
>> applied.
>
> Maybe I'm dizzy today but I'm not sure what you're talking about.
> Maybe a concrete example would help?

I was implementing an interpreter of a dynamically typed language
in OCaml, and the problem was in representing runtime values of the
embedded language in the host language. Requirements:

- Additional modules can wrap new types from the host language without
having to update a central type definition.

- When wrapping a type from the host language, it must be possible to
dynamically check whether the given wrapped value was constructed
from the given type, and if so, unwrap to give access to the
underlying value of the host language.

Rationale. For some libraries it's possible to live without this,
by wrapping the object in a record of functions which refer to the
value of the new type from their free variables, and exposing only
the external interface which uses only values of simple types. But
it's not sufficient for binary methods, and also it would force
to fix the set of exposed functions working on a given type when
exporting that type.

- In the embedded language you can attempt to apply a value to
arguments. This may succeed not only for "functions", but also for
various other wrapped types. When wrapping a type, it is decided how
objects of this type implement application to arguments.

Getting a record field is unified with application to a symbol,
so it's used for all types which want to behave like records.

In one of versions of the interpreter I solved it this way:

type ktype = int (* unique tag identifying a registered type *)

type native (* abstract placeholder for any wrapped host value *)
type obj =
| KFunction of (trace -> obj list -> obj list)
| KVariable of obj ref
| KInt of int
| KBigInt of Big_int.big_int
... etc. for various primitive types ...
| KNative of ktype * native * (trace -> native -> obj list -> obj list)
(* KNative carries: type tag, wrapped value, application function *)

The KNative constructor is open for extensions. When a new kind of
extension is registered, it gets its unique type id and uses Obj.magic
to cast something to native. And then back, after checking the type id.

Application of an object to arguments dispatches on the primitive
types, and in the case of KNative it uses the stored function (which
is not a closure over the native value but gets it as an argument,
so it can be physically shared among all values of the given type).

In a newer version of the interpreter I chose a more uniform
representation of values:

type kobj = {desc : kobj desc}

and 'a desc = {
typeObj : ktype;
idOfType : int;
mutable apply : trace -> 'a -> kobj list -> kobj;
mutable apply0 : trace -> 'a -> kobj;
mutable apply1 : trace -> 'a -> kobj -> kobj;
mutable apply2 : trace -> 'a -> kobj -> kobj -> kobj;
mutable apply3 : trace -> 'a -> kobj -> kobj -> kobj -> kobj;
}

and ktype = {
typeDesc : ktype desc;
typeId : int;
mutable typeName : kobj;
mutable directSupertypes : ktype list
}

(Many "mutable" adjectives are needed only to close recursive knots in
core parts of the library.)

Then concrete types are represented by records with the first field of
type 'a desc for some 'a, using Obj.magic to cast them to kobj and back,
after checking for the expected typeObj (or redundant idOfType). This is
again unsafe. Application of an object to arguments jumps straight to
the application code in the descriptor, without pattern matching.
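
Boiled down, the tag-checked Obj.magic idiom might look like this as a
simplified, self-contained sketch (not the interpreter's actual code):
each wrapped value carries an integer tag as its first field, and the
tag is checked before the unsafe cast back.

type wrapped = { tag : int }            (* the view shared by all values *)
type 'a cell = { cell_tag : int; value : 'a }

let fresh_tag = let n = ref 0 in fun () -> incr n; !n

(* one wrap/unwrap pair is created per registered type *)
let make_wrapper () =
  let tag = fresh_tag () in
  let wrap (v : 'a) = (Obj.magic { cell_tag = tag; value = v } : wrapped) in
  let unwrap (w : wrapped) =
    if w.tag = tag then Some (Obj.magic w : 'a cell).value else None
  in
  (wrap, unwrap)

(* e.g. let wrap_image, unwrap_image = make_wrapper ()
   fixes the element type at the first use *)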

If this was implemented in .NET, I would have two choices:

- Have my interface or superclass which declares pure virtual
application methods. Let each concrete type implement this
interface. Declare all parameters using this interface.

- Since many primitive types don't have any interesting reaction to
application to arguments, have that interface only for objects which
need it, and otherwise pass native .NET objects in the Object type
without additional wrapping. Downcast for performing an application.

They would be both safe, i.e. they wouldn't rely on constructs which,
like Obj.magic, have undefined behavior when misused.

Joachim Durchholz
Sep 30, 2005, 8:44:06 AM
Panu Kalliokoski schrieb:

> Well, I haven't yet understood how "static typing + RTTI" is not
> really dynamic typing, totally.

In my book, "static typing + RTTI" means: I can rely on the guarantees
of static typing in those parts of the program that don't use RTTI.

I'm not sure that I'm answering your question with that one though...

Regards,
Jo

Joachim Durchholz
Sep 30, 2005, 2:26:26 PM
Marcin 'Qrczak' Kowalczyk schrieb:

> Joachim Durchholz <j...@durchholz.org> writes:
>
>>>Most languages support distinguishing the kind of failures, which is
>>>used for other things besides printing it to the user and exiting,
>>>and I think it's useful enough to be worth supporting.
>>
>>I disagree about that.
>
> In my compiler (13k lines) there are three places of catching specific
> exceptions:

Let me classify the examples.

> - On the toplevel of the compiler, to report all compile errors found
> (sorted by source location)

Catching and reporting all exceptions in the toplevel doesn't require an
extendible type hierarchy.

> even if one of errors was fatal and caused the compiler to abort by
> throwing an exception.

Again, to log an exception and ignore it otherwise, no extendible type
hierarchy is needed.

> - When checking whether the interface file would change, an absent or
> malformed interface file is treated as one to be changed (catching
> IO_ERROR and CODING_ERROR).

These exceptions need to be caught because the underlying API throws
them. These things could equally well be handled using discriminated
unions (sum types).

> - For working around a bug when it's compiled using an old version of
> itself which did not define hashing of rational numbers, the method
> redefinition exception is caught there and ignored.

Can't comment on that - working around a bug often requires all sorts of
hacks. Catching and ignoring the exception might be the most elegant way
out.
OTOH this means you may not be informed about legitimate method
redefinition problems in the entire call graph below that exception
handler. In other words, there's a potentially nonlocal property that
might have to be verified a year later - marginally problematic in a
13kloc project, but a /huge/ problem in a 100kloc project. (The problem
is that 13kloc projects tend to grow to 100kloc if they are successful.
I hope you could remove that exception-eating code after the successful
bootstrap, and am sympathetic if it's been impossible.)

> In my Scheme interpreter (2800 lines) specific exceptions are caught
> on the toplevel. The toplevel distinguishes an exception translated
> from the SIGINT handler (to return to the toplevel) and PROGRAM_EXIT
> (if a Scheme function which exits is called, the interpreter exits).
> Other exceptions are printed for the user and the REPL continues.

Hmm... can't really comment on that. I've been using the Windows model
where all event handling is synchronous by default.
I wouldn't want to do it that way, but that may be too biased by
personal experience to be interesting to anybody.

> In my Tetris game (520 lines) there are some specific exceptions being
> caught:
>
> - Attempting to write a message about the terminal being too small
> may fail with CURSES_ERROR if it's too small even for that message.
> This error is ignored instead of causing the program to abort,
> the user only sees the beginning of the message.
>
> - When an attempt to hide the cursor fails, the error is ignored.

These are classical cases of managing software failure. In both cases, a
low-level routine fails to render the promised service, so the
higher-level code chooses an alternate strategy (in these cases, simply
by doing nothing).

> - An exception is used internally by the construct of providing a
> non-local exit. The construct is used for exiting the main loop
> when the user presses "q".

Now that's the one case where I think the exception should be avoided.

Sure, the code is a bit more convoluted. You have to carry the knowledge
about the exit through several call levels.
On the other hand, the code at intermediate call levels must be aware
that there's a shortcut running straight through it, that things that
are begun may not be finished. In other words, the additional hoops that
intermediate-level code has to jump through are justified, since they
serve as a reminder to any future maintainer that things are indeed that
complicated.
(Not a real issue in a 520loc program, of course, I'm extrapolating to
larger software here. For 520loc, even gotos are acceptable *g*.)

> I claim that they are legitimate uses of exceptions.

I hope I have made clear where and why I disagree.

> It would be
> inconvenient and inefficient to make error propagation explicit

Inconvenient - yes. Though, as said above, that's in places where things
*should* be inconvenient, IMO.

Inefficient? I don't think so. Discriminated unions of that kind can be
returned in processor register pairs, so if a language uses that as a
standard pattern, efficiency will almost automatically follow.
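
In OCaml terms, that standard pattern is just an ordinary sum type; a
sketch (parse_port is an invented example):

type ('a, 'e) result = Ok of 'a | Error of 'e

let parse_port s =
  try
    let n = int_of_string s in
    if n > 0 && n < 65536 then Ok n else Error ("out of range: " ^ s)
  with Failure _ -> Error ("not a number: " ^ s)

(* callers must inspect the result; nothing propagates silently *)
let describe s = match parse_port s with
  | Ok port -> "port " ^ string_of_int port
  | Error msg -> msg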

> (or even impossible in the case of the asynchronous SIGINT),

As I said, asynchronous signals aren't my area of expertise.

In the code that I've been maintaining, such signals were queued up and
processed in order, so the signal was processed just as any keyboard
input would have been. That has worked well enough for me. (It also
seems to work well enough for the Erlang guys, or at least so I infer
from the architectural choices they are presenting to the world.)

> and it
> would be fragile if reasons for exceptions were encoded as strings.

Not too much. It's really a question of what you use exceptions for.

If you use them as a replacement for a non-local goto, then yes you need
all the information you can get, and string information isn't good
enough for that.
If you use them as a way to manage failure handling, you don't even
really need a failure code. Either the called code succeeds, in which
case nothing special need be done, or it fails, in which case the
calling code should retry, change strategy, or fail - and it doesn't
need specifics of the point of failure for either choice, and in fact it
*shouldn't* (have to) know specifics to keep the software decoupled.

>>Isn't FPL programming about making dependencies explicit?
>
> I don't religiously adhere to general principles without looking at
> consequences at a specific case. Yes, usually it's better to make
> dependencies more explicit, up to a point. Too much and it will do
> more harm than good.

Sure.

It's just that I found that the downsides of *not* using exceptions are
often overestimated.
I have programmed under the an-exception-is-failure paradigm for
several years, and I *never* felt the need to extend the error type
hierarchy. In fact there was only a single occasion where I had to write
code that differentiated between error classes, and that was fairly
low-level code, very near to the execution engine (when transferring
exception information from the host language to the 4GL interpreter on
top of it, IIRC).

> In particular many people claim that Java checked exceptions were
> a mistake. The essence of exceptions is that they transparently
> propagate through code which is executing between the point of
> throwing and catching. Making the propagation explicit in types makes
> hard to write reusable code which propagates outside different kinds
> of exceptions depending on what is executed inside.

That's a problem indeed.

My knee-jerk reaction was to say that I don't need to distinguish
exception classes anyway, so it's OK to say "throws Exception"
everywhere :-)

> It could be
> improved by making possible to parametrize methods and classes by sets
> of possible exceptions, e.g. expressing that this method throws the
> union of exception types thrown by parameters of these two generic
> classes, but then the language would get more complex. I'm happy with
> leaving this implicit, as in almost all languages besides Java.

Indeed :-)

If I were to design a language, my stance would be more simplistic: no
exception classes, exception information tailored towards debugging and
post-mortem analysis, and an exception is a contract failure. And be
done with it - everything else can be fully handled using the normal
function call interfaces.

I know your mileage does vary ;-)

Regards,
Jo

Dirk Thierbach
Sep 30, 2005, 2:56:19 PM
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> I was implementing an interpreter of a dynamically typed language
> in OCaml, and the problem was in representing runtime values of the
> embedded language in the host language. Requirements:

> - Additional modules can wrap new types from the host language without
> having to update a central type definition.

Why do you want to wrap a type of the host language into the
interpreted language in the first place? The straightforward approach
is to implement dynamic types as they should be implemented, i.e. base
types and then generic type constructors like Lisp lists, with the
option to attach additional type identifiers. Maybe add some extra
construct to keep a wrapped dynamic type opaque in the interpreted
language, if it is important.

Any additional modules have to dispatch on the type anyway, so
you just add one pattern matching, and the conversion becomes
trivial. Static typing points out when you have to check, and
the compiler will tell you when you forgot a catch-all to throw
the dynamic error.
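
A minimal sketch of that closed representation in OCaml (constructors
invented here):

exception Dynamic_error of string

type value =
  | Int of int
  | Str of string
  | List of value list
  | Func of (value list -> value)
  | Handle of string * int  (* type identifier, key into a library table *)

(* each module dispatches with one pattern match; the compiler flags a
   non-exhaustive match if the catch-all below is forgotten *)
let as_int = function
  | Int n -> n
  | _ -> raise (Dynamic_error "integer expected")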

Maybe wrapping types directly gives a tiny bit of performance,
but that shouldn't be enough to justify a somewhat contorted
implementation.

- Dirk

George Neuner
Sep 30, 2005, 3:13:25 PM
On Thu, 29 Sep 2005 15:24:06 GMT, David Hopwood
<david.nosp...@blueyonder.co.uk> wrote:

>For true believers of this religion, the point that there are ways
>to do metaprogramming just as easily without having the source code
>correspond as literally to an AST as s-expressions do, seems to fall
>on deaf ears.

Certainly it is possible, however I think the "true believers" would
question the claim that it can be done "just as easily".

I am not a "true believer" btw - I see utility in Lisp's syntax but
not necessarily beauty. And I don't care how meta programming
features are implemented as long as they can be used without
unnecessary gyration ... spinning makes me dizzy.

GW

Marcin 'Qrczak' Kowalczyk
Sep 30, 2005, 5:31:48 PM
Joachim Durchholz <j...@durchholz.org> writes:

>> - On the toplevel of the compiler, to report all compile errors found
>> (sorted by source location)
>
> Catching and reporting all exceptions in the toplevel doesn't require
> an extendible type hierarchy.

I needed one new kind of exception here, specific to this program,
which means "the compiler found a fatal error and it cannot continue
compiling this module, so we return to compiler's toplevel code and
print errors found so far (only one of them was fatal) before exiting".

Of course the similar behavior could be obtained differently:

- Print all errors when the fatal error is found and exit the process
in a hard way.

(Exiting the process is itself performed by throwing an exception,
so cleanup code of active "brackets" of explicit finalization is run,
and thus this wouldn't really change anything except demodularizing
the error dumping code and perhaps failing to free some resources.)

- Design the compiler such that all errors be non-fatal, attempt to
continue in any case.

(This is indeed what happens most of the time. I still use fatal
errors for improbable cases where I didn't want to bother with
inventing an error result: when the compiler has detected violation
of certain invariants of the data it is manipulating, i.e. found a
bug in itself, or when the interface file - normally written by a
previous compilation, not by a human - could not be parsed.)

I claim that a distinguished exception used for jumping from the point
of detection of an unusual error to the point where compilation errors
are reported is cleaner.

Note that I don't use exceptions much. An average Java program of the
same size probably has more 'try' statements. But I definitely consider
exceptions important enough to have a dedicated language mechanism.

>> - When checking whether the interface file would change, an absent or
>> malformed interface file is treated as one to be changed (catching
>> IO_ERROR and CODING_ERROR).
>
> These exceptions need to be caught because the underlying API throw
> them. These things could equally well be handled using discriminated
> unions (sum types).

And let each I/O operation return a discriminated union of the
expected result and an error? No way, very inconvenient. Even Haskell
uses exceptions for that, and they are even hardwired in the IO monad
instead of being syntactic sugar over explicit status checking.

My language Kogut does use a distinguished result instead of an
exception for reporting an end of file though.
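
An OCaml rendering of the same choice, wrapping the exception once at
the boundary so that callers see a value (a sketch):

(* End_of_file becomes an ordinary value here *)
let read_line_opt ic =
  try Some (input_line ic) with End_of_file -> None

let rec count_lines ic n =
  match read_line_opt ic with
  | Some _ -> count_lines ic (n + 1)
  | None -> n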

>> - For working around a bug when it's compiled using an old version of
>> itself which did not define hashing of rational numbers, the method
>> redefinition exception is caught there and ignored.
>
> Can't comment on that - working around a bug often requires all sorts
> of hacks. Catching and ignoring the exception might be the most
> elegant way out.
> OTOH this means you may not be informed about legitimate method
> redefinition problems in the entire call graph below that exception
> handler.

It includes only that single method definition, no problem here.

> Hmm... can't really comment on that. I've been using the Windows
> model where all event handling is synchronous by default.

No matter what the OS is, when the program performs a mathematical
computation programmed by the user or on data provided by the user,
the user might want to abort it when the computation seems to take
too much time, perhaps by pressing some "Cancel" button. This includes
e.g. image processing.

> Inefficient? I don't think so. Discriminated unions of that kind can
> be returned in processor register pairs, so if a language uses that as
> a standard pattern, efficiency will almost automatically follow.

Inefficient because the result would have to be checked after each
function call. Most implementations of exceptions have zero overhead
for normal expression evaluation when exceptions don't occur.

Matthew D Swank
Sep 30, 2005, 6:18:19 PM
On Fri, 30 Sep 2005 11:48:02 +0000, Panu Kalliokoski wrote:


> Well, I haven't yet understood how "static typing + RTTI" is not really
> dynamic typing, totally.
>
> Panu

Because you _choose_ syntactically to subvert the type system. Static
type checking is based on reasoning about syntax. If you don't upcast
or dynamically dispatch on types, etc., it's statically typed. You can act
as if the RTTI isn't there.

In Lisp* all the type information about values is resolved at runtime
by default. You _have_ to use the runtime to detect type errors.

Matt

*excluding Common Lisp compilers that treat declarations as compile-time
assertions

--
"You do not really understand something unless you can
explain it to your grandmother." — Albert Einstein.

Joachim Durchholz
Sep 30, 2005, 6:29:30 PM
Marcin 'Qrczak' Kowalczyk schrieb:

> Joachim Durchholz <j...@durchholz.org> writes:
>
>>>- On the toplevel of the compiler, to report all compile errors found
>>> (sorted by source location)
>>
>>Catching and reporting all exceptions in the toplevel doesn't require
>>an extendible type hierarchy.
>
> I needed one new kind of exception here, specific to this program,
> which means "the compiler found a fatal error and it cannot continue
> compiling this module, so we return to compiler's toplevel code and
> print errors found so far (only one of them was fatal) before exiting".

That's new behavior, not a new exception type.

> Of course the similar behavior could be obtained differently:
>
> - Print all errors when the fatal error is found and exit the process
> in a hard way.

Not what I'd design.

> (Exiting the process is itself performed by throwing an exception,
> so cleanup code of active "brackets" of explicit finalization is run,
> and thus this wouldn't really change anything except demodularizing
> the error dumping code and perhaps failing to free some resources.)

Resource freeing can be an issue independently of whether exceptions are
in a type hierarchy, so that doesn't decide the point in question.

> - Design the compiler such that all errors be non-fatal, attempt to
> continue in any case.
>
> (This is indeed what happens most of the time.

OK, just the way I'd design it.

> I still use fatal
> errors for improbable cases where I didn't want to bother with
> inventing an error result: when the compiler has detected violation
> of certain invariants of the data it is manipulating, i.e. found a
> bug in itself, or when the interface file - normally written by a
> previous compilation, not by a human - could not be parsed.)

Again, that's exactly what I'd do: these are all software failures of
some kind. If an invariant is broken, there isn't much point in trying
to find out what exactly broke (diagnosing itself goes far beyond the
mission statement of any compiler), so declare failure and abort
compilation.

> I claim that a distinguished exception used for jumping from the point
> of detection of an unusual error to the point where compilation errors
> are reported is cleaner.

Now my claim is that you're already following the "an exception is a
failure" philosophy, though you may be doing it for pragmatic reasons ;-)

My /counterclaim/ is that you don't need a distinguished exception.
Whether the problem that the compiler hits is broken invariants in
intermediate files, or in its internal data structures, or an unhandled
divide-by-zero, it's always the same: the compiler must quit whatever
it's doing.

That it emits all accumulated error information is a good fallback
strategy, and useful regardless of the actual cause of a breakdown.

> Note that I don't use exceptions much. An average Java program of the
> same size probably has more 'try' statements. But I definitely consider
> exceptions important enough to have a dedicated language mechanism.

Agreed. Exceptions are a useful tool.

It's just that I don't see a dynamically extendible exception hierarchy
as very useful. Actually I think it promotes nonlocal dependencies,
because exceptions tend to expose local information to an entirely
different abstraction layer.

>>>- When checking whether the interface file would change, an absent or
>>> malformed interface file is treated as one to be changed (catching
>>> IO_ERROR and CODING_ERROR).
>>
>>These exceptions need to be caught because the underlying API throw
>>them. These things could equally well be handled using discriminated
>>unions (sum types).
>
> And let each I/O operation return a discriminated union of the
> expected result and an error? No way, very inconvenient. Even Haskell
> uses exceptions for that, and even they are hardwired in the IO monad
> instead of being a syntactic sugar over explicit status checking.

I have done it the other way round. In a (yuck!) imperative OO language.

For example, file error handling. A File object would remember its
status, and when it hit the first I/O error, it would simply set an
Error flag and ignore any future requests. That flag would have to be
cleared explicitly - so higher-level abstractions would have the choice
to either emit a series of operations and check the error status for the
entire sequence, or to check each single operation, whatever was more
appropriate.
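
A rough OCaml transliteration of that latched-error design (my sketch
of the OO version described above):

type safe_file = { ch : out_channel; mutable failed : bool }

(* the first failure sets the flag; later writes are silently ignored *)
let write f s =
  if not f.failed then
    try output_string f.ch s with Sys_error _ -> f.failed <- true

(* check once for a whole sequence of operations, then clear the flag *)
let check_and_clear f =
  let e = f.failed in
  f.failed <- false;
  e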

I'm pretty sure something similar could be done for Haskell's IO monad,
and certainly for OCaml/SML/MFTL.

> My language Kogut does use a distinguished result instead of an
> exception for reporting an end of file though.

Actually it was the other way round for the package I was working with:
reading past end-of-file was considered a breach of the Read procedure's
precondition, so that raised a "precondition broken" exception.

Programming with that wasn't that bad, mostly due to an OK library
design. Not being allowed to read past EOF isn't such a problem if every
file will always give you the minimum number of bytes that are still
available.

>>Hmm... can't really comment on that. I've been using the Windows
>>model where all event handling is synchronous by default.
>
> No matter what the OS is, when the program performs a mathematical
> computation programmed by the user or on data provided by the user,
> the user might want to abort it when the computation seems to take
> too much time, perhaps by pressing some "Cancel" button. This includes
> e.g. image processing.

On a Windows platform, programs usually have a single thread that reads
and handles messages as they arrive. The "Cancel" request arrives as a
message and gets processed just like anything else.

If you need an architecture where long-running tasks must be separately
interruptible, you need multiple threads. One thread does the GUI, the
other the image processing (good example). The GUI thread may "shoot"
the processing thread, which would arrive as an exception which (if not
caught) will abort the processing thread.

Nothing really difficult here. The Erlang folks have been doing it that
way all the time, and they seem to be very happy with that design.

>>Inefficient? I don't think so. Discriminated unions of that kind can
>>be returned in processor register pairs, so if a language uses that as
>>a standard pattern, efficiency will almost automatically follow.
>
> Inefficient because the result would have to be checked after each
> function call. Most implementations of exceptions have zero overhead
> for normal expression evaluation when exceptions don't occur.

Ah, I see.

Agreed with that one.

It's a price I'd be willing to pay for a cleaner program structure.

If I don't want to check for error results after every step (and I
certainly don't!), I'd follow the "NaN route" from floating-point
arithmetic: make the functions accept the error values, and have them
"do nothing" and return an error value if they are fed with an error
value. (It's essentially a variant of the "file error status" strategy,
described above.)
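
For ordinary values, that "NaN route" is essentially the option type
with an absorbing bind; a small OCaml sketch:

let ( >>= ) x f = match x with
  | None -> None            (* error values flow through untouched *)
  | Some v -> f v

let safe_div a b = if b = 0 then None else Some (a / b)

(* intermediate steps need no explicit checks *)
let compute a b c = safe_div a b >>= fun q -> safe_div q c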

Regards,
Jo

Marcin 'Qrczak' Kowalczyk
Sep 30, 2005, 7:47:44 PM
Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

>> - Additional modules can wrap new types from the host language
>> without having to update a central type definition.
>
> Why do you want to wrap a type of the host language into the
> interpreted language in the first place?

Consider e.g. a library in the host language for manipulating images,
with operations to create images in memory, draw on them using tools
analogous to interactive image manipulation, and save them in files.

When made available to a different language, objects of the source
language storing image data should be wrapped in the target language,
such that calling certain functions of the target language unwrap the
original source language objects and apply the corresponding source
language functions to them. Or translate from analogous types instead
of unwrapping in case of simple values like integers.

> Any additional modules have to dispatch on the type anyway, so
> you just add one pattern matching, and the conversion becomes
> trivial.

I don't understand. I don't want to make a huge algebraic type in the
host language with constructors corresponding to all possible types of
the embedded language, because it's unmodular: adding another library
requires updating an existing type definition and updating functions
which dispatch a generic function on them. Libraries providing new
types should be self-contained and distributable separately from the
main interpreter.

Marcin 'Qrczak' Kowalczyk
Sep 30, 2005, 8:13:06 PM
Joachim Durchholz <j...@durchholz.org> writes:

>> I needed one new kind of exception here, specific to this program,
>> which means "the compiler found a fatal error and it cannot continue
>> compiling this module, so we return to compiler's toplevel code and
>> print errors found so far (only one of them was fatal) before exiting".
>
> That's new behavior, not a new exception type.

The code near the toplevel which reacts to this non-local exit
exception distinguishes it from other exceptions, e.g. from really
unexpected exceptions which should be reported with a stack trace
(like taking the first element of an empty sequence), or from the
exception generated by ^C which is propagated to the ultimate toplevel
and signalled to the OS so that the parent process will be able to
determine that our process was terminated by the signal.

> My /counterclaim/ is that you don't need a distinguished exception.

We can't predefine all possible kinds of exceptions. There are lots of
domain-specific exceptions introduced by various libraries, like zlib
reporting about malformed compressed data, or a database library
reporting about insufficient permissions. The kind of problem
determines how and when it is reacted upon, what to abort and what
to restart, and whether it is a bug in our program or we should try
to continue.

> For example, file error handling. A File object would remember its
> status, and when it hit the first I/O error, it would simply set an
> Error flag and ignore any future requests.

This is how it worked in Turbo Pascal. I don't like this. It's too
easy to fail to check the status often enough, and then the program
will unnecessarily attempt to continue despite its I/O being partially
redirected to nowhere, failing to notice that there is a problem.

> If I don't want to check for error results after every step (and I
> certainly don't!), I'd follow the "NaN route" from floating-point
> arithmetic: make the functions accept the error values, and have
> them "do nothing" and return an error value if they are fed with an
> error value. (It's essentially a variant of the "file error status"
> strategy, described above.)

The problem with that is that after a long computation I might get
this NaN result and have no idea where it originated. An uncaught
exception allows the programmer to see a stack trace.

Joachim Durchholz
Oct 1, 2005, 2:59:53 AM
Marcin 'Qrczak' Kowalczyk schrieb:

> Joachim Durchholz <j...@durchholz.org> writes:
>
>>>I needed one new kind of exception here, specific to this program,
>>>which means "the compiler found a fatal error and it cannot continue
>>>compiling this module, so we return to compiler's toplevel code and
>>>print errors found so far (only one of them was fatal) before exiting".
>>
>>That's new behavior, not a new exception type.
>
> The code near the toplevel which reacts to this non-local exit
> exception distinguishes it from other exceptions, e.g. from really
> unexpected exceptions which should be reported with a stack trace
> (like taking the first element of an empty sequence), or from the
> exception generated by ^C which is propagated to the ultimate
> toplevel and signalled to the OS so that the parent process will be
> able to determine that our process was terminated by the signal.

Maybe I'm dense, but I fail to see the difference between "unexpected
situations" (failed contracts on internal data) and "really unexpected
situations" (failed contracts when using underlying infrastructure).

From the perspective of the end user, I mean.

>>My /counterclaim/ is that you don't need a distinguished exception.
>
> We can't predefine all possible kinds of exceptions. There are lots of
> domain-specific exceptions introduced by various libraries, like zlib
> reporting about malformed compressed data, or a database library
> reporting about insufficient permissions. The kind of the problem
> determines how and when it is reacted upon, what to abort and what
> to restart, is it a bug in our program or should we try to continue.

That argument doesn't hold, at least not from my perspective.

If exceptions thrown are from an external library, there are two
possible ways that I can react to them:
1) It's unexpected, and the layer that called it should fail. Package
the external-library exception data into a format that developers can
use and raise an exception (or, if external library and calling layer
are in the same exception-handling framework, simply don't catch the
exception).
2) It's expected, an indication of some anticipated situation. Inspect
the exception data, and do whatever is appropriate for that situation.

In either case, the calling software layer doesn't need to propagate the
detail information of the exception.

>>For example, file error handling. A File object would remember its
>>status, and when it hit the first I/O error, it would simply set an
>>Error flag and ignore any future requests.
>
> This is how it worked in Turbo Pascal. I don't like this. It's too
> easy to fail to check the status often enough, and then the program
> will unnecessarily attempt to continue despite its I/O being partially
> redirected to nowhere, failing to notice that there is a problem.

So what? A few unnecessary calls if an unanticipated error situation
comes up aren't a big issue. It's the normal case that needs to be
efficient, not the unanticipated one.

(If the errors are anticipated, you'd have to check every call anyway.
If that's too inconvenient, you'd probably wrap the I/O calls into a
layer that does the checking.)

>>If I don't want to check for error results after every step (and I
>>certainly don't!), I'd follow the "NaN route" from floating-point
>>arithmetic: make the functions accept the error values, and have
>>them "do nothing" and return an error value if they are fed with an
>>error value. (It's essentially a variant of the "file error status"
>>strategy, described above.)
>
> The problem with that is that after a long computation I might get
> this nan result, and have no idea where it originated. An uncaught
> exception allows the programmer to see a stack trace.

If you want to pinpoint the exact place where the NaN originated, you'd
sprinkle your code with assertions that say "not isNaN (foo)". That way,
unanticipated NaNs are easily turned into early failures.

Regards,
Jo

Panu Kalliokoski
Oct 1, 2005, 3:59:48 AM
In <pan.2005.09.30....@c.net> Matthew D Swank <akopa-is-very-much-...@c.net> writes:
>> Well, I haven't yet understood how "static typing + RTTI" is not really
>> dynamic typing, totally.
>Because you _choose_ syntactically to subvert the type system. Static
>type checking is based on reasoning about syntax. If you don't upcast
>or dynamically dispatch on types, etc., it's statically typed. You can act
>as if the rtti isn't there.

Well, then, Lisp without use of type predicates is statically typed. (I
know such programs exist; in fact, there are many production-size Lisp
programs without type predicates.)

Even if _your_ program does not use RTTI, some library that binds to
your code might use it. So the compiler does not really know, in the
general case, that it can statically type a program which does not use
RTTI.

>In Lisp* all the type information about values is resolved at runtime
>by default. You _have_ to use the runtime to detect type errors.

What? I'm no Common Lisp expert, but I thought there was something
called |declare|. Besides, macros can be used to code-walk all
invocations of functions to check the type of the arguments at
compile-time.

I have a hard time believing that this static/dynamic typing discussion
would really be about having syntax to declare the types of specific
expressions. I think a program is statically typed when, and only when,
it does not have to check the type of any expression at run time.
Statically typed languages are languages where every program is
statically typed. Maybe this definition is too extreme, but I urge you
all to come up with better definitions for "statically typed program /
language".

Marcin 'Qrczak' Kowalczyk
Oct 1, 2005, 5:52:14 AM
Panu Kalliokoski <ate...@sange.fi> writes:

> Well, then, Lisp without use of type predicates is statically typed.

It's not, because compilers don't reject programs merely because they
can't prove them to use types consistently.

> What? I'm no Common Lisp expert, but I thought there was something
> called |declare|.

<http://www.lisp.org/HyperSpec/Body/dec_type.html>:

"During the execution of any reference to the declared variable within
the scope of the declaration, the consequences are undefined if the
value of the declared variable is not of the declared type."

Compilers might try to infer some types statically and warn for
potential mismatches, but the ordinary semantics is that the
programmer promises to use the given type here, and the compiler
might take advantage of this for optimization (even if the assumption
turns out to be false), and might check that at runtime when debugging
options are turned on. Or not.

The CL type language is too limited to express many constructs which
can be statically typed in other languages. I think you can't express
parametric polymorphism, nor even a homogeneous list of elements of a
concrete type. It seems not to have been intended for finding errors at
compile time; it was intended for optimization: eliminating runtime
dispatch and unboxing storage.

> Besides, macros can be used to code-walk all invocations of
> functions to check the type of the arguments at compile-time.

I don't believe it's practical to do that. The original CL type system
is not expressive enough, macros are given too few tools for
manipulation of types (I think they can't even expand type aliases),
a macro can't see all callers of a function, a macro can't see types of
standard functions nor walk their source, and standard functions don't
even have to have sensible static types. What is the type of COERCE?
http://www.lisp.org/HyperSpec/Body/fun_coerce.html

> I think a program is statically typed when, and only when, it does
> not have to check the type of any expression at run time.

It's not that black & white. First, because there is no
language-independent definition of a type. For example in Perl if you
consider only scalars, arrays, hashes, subroutines and IO handles,
then it's statically typed. If you distinguish references blessed for
a particular package as distinct types of scalars, then it's not.

Is checking the exn constructor in OCaml dynamic typing? I know OCaml
doesn't call them types, but they correspond to types in other languages.
We could declare any feature to be statically typed merely by calling
the thing it checks something other than the type. Any language becomes
statically typed if we distinguish only one type for everything.

Second, there is much between checking no types and checking all
types. Is OCaml statically typed? Its implementation of = checks types
at runtime (at least some of them, e.g. floats and strings), and its
array operations distinguish float arrays at runtime (unless the
compiler has determined that a given array for sure is or is not a
float array).

Is C statically typed? It doesn't give the programmer tools for
checking the types, it only follows pointers, calls functions etc.
You have to implement RTTI yourself if needed. Is working with void *
and casting it to the expected type statically typed programming? The
compiler doesn't have to check the type of any expression at runtime.

Is C++ less statically typed than C? This would be counter-intuitive.
Yes, it provides some RTTI. But its type system is more expressive
(templates, subtyping), so there are fewer cases when you have to
step outside it and work with void *. It's not fair to call C more
statically typed than C++.

To be constructive, I think the distinction between mostly static
typing and mostly dynamic typing is almost clear after we decide
what to call the type, which is somewhat arbitrary. A language is
statically typed if it routinely and predictably rejects programs
which can't be proved to follow its typing rules. It's dynamically
typed if it accepts unsure programs by default, and at most tries
to warn when types look deeply suspicious.

Some statically typed languages make it easy to transliterate dynamically
typed programs to them, while reusing all libraries of the original
language (the OO family). Others make this hard, forcing to make
wrappers for all existing types to be supported (the functional family).
OCaml's exn and Glasgow Haskell's Dynamic are somewhat in-between:
allow open extension but not as conveniently as in OO languages.

Perl feels dynamically typed when using it, and Java feels statically
typed. Even though in older Java the type system is poor and one has
to resort to dynamic typing quite often. The intuition was there
before formalisms, so any definition should make similar decisions
in order to be useful. Another definition could distinguish Java and
Haskell and say that the former is more dynamically typed, as long as
it distinguishes also Java from Scheme, having more categories than
two.


Joachim Durchholz
Oct 1, 2005, 5:59:25 AM
Panu Kalliokoski schrieb:

> I think a program is statically typed when, and only when,
> it does not have to check the type of any expression at run time.

It's a continuum.

Some part of a program may be statically typed, and other parts may be
dynamically typed.

E.g. code like

if foo is-a SomeType then
   ... code assuming foo is of type SomeType ...
endif

is dynamically typed in the condition, but would reasonably be
considered statically typed in the "then" branch.

> Statically typed languages are languages where every program is
> statically typed. Maybe this definition is too extreme, but I urge you
> all to come up with better definitions for "statically typed program /
> language".

Just apply the definition to program sections.

I doubt there's any language that's fully statically typed and useful in
practice. Even those languages that were designed to be statically typed
sooner or later acquired loopholes for circumventing the type system,
though the level of control applied to such loopholes varies. (You have
to "ask for permission" to even import the typecasting facilities into a
module in Ada, while it wasn't even explicitly mentioned in early
versions of C.)

Besides, as type systems become more powerful and cover additional
properties of a program, language designers have to go further out of
their way to avoid needing such loopholes. Most if not all
statically-typed OO languages need downcasts, something that was
entirely unnecessary for languages with a less expressive type system
(e.g. Algol, Pascal, or - with some caveats - C).

Regards,
Jo

Marcin 'Qrczak' Kowalczyk
Oct 1, 2005, 6:45:23 AM
Joachim Durchholz <j...@durchholz.org> writes:

> Maybe I'm dense, but I fail to see the difference between "unexpected
> situations" (failed contracts on internal data) and "really unexpected
> situations" (failed contracts when using underlying infrastructure).
>
> From the perspective of the end user, I mean.

For the end user they are the same. But they are handled differently.

When the former problem is detected, it logs the error using the
generic mechanism of the compiler for gathering errors, possibly
associated with a source location, and the exception is used only
to transfer the control flow.

The latter problem is detected in parts not specific to the compiler,
and the only information about the reason is carried by the exception
itself (and the stack trace).

The distinctive property of these fatal errors is not that they are
bugs in the compiler, but that compilation cannot continue after they
are found. Currently the two just happen to almost coincide.

[...]


> In either case, the calling software layer doesn't need to propagate
> the detail information of the exception.

I don't understand. Consider the well-known function Map which applies
a function to each element of a sequence and gathers results. If some
application fails, the entire Map call fails with the same exception.
Was it expected by Map? It didn't expect any particular exception,
but it did expect that some application might fail. And the exception
is propagated.

Exceptions are useful when the cause of the problem is separated from
the point when it can be handled by some other code. Without exceptions
I would have to have a separate variant of Map which allows the mapped
function to signal that it wishes to abort processing.
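
Such a variant might look like this in OCaml (a sketch; the Ok/Error
type is defined inline to keep it self-contained):

type ('a, 'e) result = Ok of 'a | Error of 'e

let map_until_error f xs =
  let rec go acc = function
    | [] -> Ok (List.rev acc)
    | x :: rest ->
        (match f x with
         | Ok y -> go (y :: acc) rest
         | Error e -> Error e)   (* abort, discarding work so far *)
  in
  go [] xs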

>> This is how it worked in Turbo Pascal. I don't like this. It's too
>> easy to fail to check the status often enough, and then the program
>> will unnecessarily attempt to continue despite its I/O being partially
>> redirected to nowhere, failing to notice that there is a problem.
>
> So what? A few unnecessary calls if an unanticipated error situation
> comes up aren't a big issue. It's the normal case that needs to be
> efficient, not the unanticipated one.

It's not just efficiency but performing other interaction with the
world which shouldn't take place after an error. Besides, it would
be applicable only to reading and writing files; what about other
filesystem operations, like renaming a file or listing directory
contents? In these cases there is no file object to store the error
flag in.

Dirk Thierbach
Oct 1, 2005, 4:30:03 AM
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:

> Consider e.g. a library in the host language for manipulating images,
> with operations to create images in memory, draw on them using tools
> analogous to interactive image manipulation, and save them in files.

> When made available to a different language, objects of the source
> language storing image data should be wrapped in the target language,
> such that calling certain functions of the target language unwrap the
> original source language objects and apply the corresponding source
> language functions to them. Or translate from analogous types instead
> of unwrapping in case of simple values like integers.

Then store the image internally in a hash table, and only use an
appropriate handle in the interpreted language, that is then mapped
to the real data in the library. Existing dynamically typed languages
like Tcl, which are made among other things to easily integrate C
libraries into the language, use exactly this technique.
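In OCaml, a minimal sketch of that handle technique might look like
this (image stands in for the opaque C-side type; all names are
illustrative):

(* The embedded language only ever sees integer handles; the table
   maps them back to the real library data. *)
type image  (* abstract; values come from the C library *)

let images : (int, image) Hashtbl.t = Hashtbl.create 16
let next_handle = ref 0

let register (img : image) : int =
  incr next_handle;
  Hashtbl.replace images !next_handle img;
  !next_handle

let lookup (h : int) : image =
  try Hashtbl.find images h
  with Not_found -> failwith "invalid or stale image handle"

let free (h : int) : unit = Hashtbl.remove images h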

>> Any additional modules have to dispatch on the type anyway, so
>> you just add one pattern matching, and the conversion becomes
>> trivial.

> I don't understand. I don't want to make a huge algebraic type in the
> host language with constructors corresponding to all possible types of
> the embedded language,

Yes, and you don't have to. You make a comparatively simple algebraic
type, never change it, and use this everywhere. You cannot "wrap" all
the types of the host language in this way, but you don't need to.

> Libraries providing new types should be self-contained and
> distributable separately from the main interpreter.

Certainly.

- Dirk

Marcin 'Qrczak' Kowalczyk

unread,
Oct 1, 2005, 9:04:14 AM10/1/05
to
Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

> Then store the image internally in a hash table, and only use an
> appropriate handle in the interpreted language, that is then mapped
> to the real data in the library.

It wouldn't allow the data to be automatically freed when the GC
of the target language determines that it no longer needs it.

And it needs reinventing RTTI in the target language for meaningful
errors when types of wrapped objects are mismatched.

Joachim Durchholz

unread,
Oct 1, 2005, 9:43:04 AM10/1/05
to
Marcin 'Qrczak' Kowalczyk schrieb:

> Joachim Durchholz <j...@durchholz.org> writes:
>
>>Maybe I'm dense, but I fail to see the difference between "unexpected
>>situations" (failed contracts on internal data) and "really unexpected
>>situations" (failed contracts when using underlying infrastructure).
>>
>> From the perspective of the end user, I mean.
>
> For the end user they are the same. But they are handled differently.
>
> When the former problem is detected, it logs the error using the
> generic mechanism of the compiler for gathering errors, possibly
> associated with a source location, and the exception is used only
> to transfer the control flow.

Hmm... obviously I don't have enough details to comment on your framework.
The best I can do is to compare with projects I have been involved with
- and in those, a *need* to handle errors differently never arose.
Might be an architectural thing. If you design things to handle errors
uniformly, the software design process adapts itself so that things work
out well.

> The latter problem is detected in parts not specific to the compiler,
> and the only information about the reason is carried by the exception
> itself (and the stack trace).
>
> The distinctive property of these fatal errors is not that they are
> bugs in the compiler, but that compilation cannot continue after they
> are found. The two categories merely happen to almost coincide at present.

Um... there doesn't seem to be a distinction there. In both cases,
compilation is aborted.

From what I gather, the only distinction is that inconsistencies get
annotated with additional information about where in the program source
the error was detected. Which strikes me as being worth something to you
(as a developer), but not very interesting for a user.

>>Either case, the calling software layer doesn't need to propagate
>>the detail information of the exception.
>
> I don't understand. Consider the well known function Map which applies
> a function to each element of a sequence and gathers results. If some
> application fails, the entire Map call fails with the same exception.
> Was it expected by Map? It didn't expect any particular exception,
> but it did expect that some application might fail. And the exception
> is propagated.

You need an exception trace, i.e. a list of all functions that the
exception propagated through when it was caught.
The exception trace is almost identical to a stack trace most of the
time, though there can be complications if another exception occurs
during exception handling, or if code handles an exception and throws
another one (in the latter case, it's still useful to carry information
on the original exception around).

However, this exception trace is just debugging information and can be
carried in string (text) form. The software isn't supposed to base
decisions on the contents of the exception trace.

If Map doesn't catch the exception (and indeed it shouldn't normally),
it will be aborted by the exception, up the call stack until there's an
exception handler. That handler will usually print or store the
exception trace in a place where a developer can see it, and then go on
to decide what to do with the failure (abort, retry, try something
different).

> Exceptions are useful when the cause of the problem is separated from
> the point when it can be handled by some other code. Without exceptions
> I would have to have a separate variant of Map which allows the mapped
> function to signal that it wishes to abort processing.

If an exception is a software failure, Map cannot succeed if any of the
applications fail. So you don't need a second version of Map, Map simply
shouldn't catch exceptions.

If there's still value in a partially-successful map, then either the
mapped function should be rewritten to catch its own exceptions
(something that should only be done at either high or low levels of the
software), or some Fold should be used to accumulate error information
while the mapping progresses.
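For instance, a rough OCaml sketch of such an accumulating fold:

(* Collects successful results and error descriptions side by side
   instead of aborting on the first failing application. *)
let map_collect f xs =
  let step (oks, errs) x =
    try (f x :: oks, errs)
    with e -> (oks, Printexc.to_string e :: errs)
  in
  let oks, errs = List.fold_left step ([], []) xs in
  (List.rev oks, List.rev errs)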

>>>This is how it worked in Turbo Pascal. I don't like this. It's too
>>>easy to fail to check the status often enough, and then the program
>>>will unnecessarily attempt to continue despite its I/O being partially
>>>redirected to nowhere, failing to notice that there is a problem.
>>
>>So what? A few unnecessary calls if an unanticipated error situation
>>comes up aren't a big issue. It's the normal case that needs to be
>>efficient, not the unanticipated one.
>
> It's not just efficiency; it's also that other interaction with the
> world shouldn't take place after an error. Besides, it would
> be applicable only to reading and writing files; what about other
> filesystem operations, like renaming a file or listing directory
> contents? In these cases there is no file object to store the error
> flag in.

Either the problem is a failure. In that case, throwing an exception and
aborting is the best solution - and in fact that's what I'm advocating
(I'm advocating simplicity in exceptions, not abolishing them!).
Or the problem is something that might happen. In that case, throwing an
exception isn't a good solution - you might be aborting all kinds of
higher-level transactions you don't know about. Or, the higher-level
transactions would have to write exception handlers to make sure that
their transactions are completed, and they'd have to inspect any
exception that percolates up and check whether it's a failure or
something they're supposed to handle. Which means they have to know what
kinds of exceptions might come from the bowels of the lower layers,
which means you have dependencies across abstraction layers. And that's
really bad in my book.

Regards,
Jo

David Hopwood

unread,
Oct 1, 2005, 11:28:40 AM10/1/05
to
Panu Kalliokoski wrote:
> I have hard time believing that this static/dynamic typing discussion
> would really be about having syntax to declare the types of specific
> expressions. I think a program is statically typed when, and only when,
> it does not have to check the type of any expression at run time.
> Statically typed languages are languages where every program is
> statically typed.

That's a property of a language implementation, not a language.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Dirk Thierbach

unread,
Oct 1, 2005, 1:21:56 PM10/1/05
to
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

>> Then store the image internally in a hash table, and only use an
>> appropriate handle in the interpreted language, that is then mapped
>> to the real data in the library.

> It wouldn't allow the data to be automatically freed when the GC
> of the target language determines that it no longer needs it.

That's why you have hooks in the GC in many languages. And for some
types freeing data won't be enough, anyway; the library might have
to release other external resources (close windows, release devices,
whatever).
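A minimal OCaml sketch of such a hook, assuming a hypothetical C stub
ml_release_image that frees whatever the library holds:

type image
external release_image : image -> unit = "ml_release_image"

(* Ask the GC to run the release function once the wrapper value
   becomes unreachable. *)
let wrap (img : image) : image =
  Gc.finalise release_image img;
  img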

> And it needs reinventing RTTI in the target language for meaningful
> errors when types of wrapped objects are mismatched.

Why? Wrapped objects are opaque. You have to dispatch on them in
the library, anyway. And that's where you throw meaningful errors.
Simple and straightforward.

I am not saying wrapping any static type into a single special static
type cannot be useful. Haskell has Dynamic to do that, for example.

But there are other ways to do it, too. And none of these ways really
needs to be as complicated as your approach.

- Dirk

Marcin 'Qrczak' Kowalczyk

unread,
Oct 1, 2005, 3:15:07 PM10/1/05
to
Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

>>> Then store the image internally in a hash table, and only use an
>>> appropriate handle in the interpreted language, that is then mapped
>>> to the real data in the library.
>
>> It wouldn't allow the data to be automatically freed when the GC
>> of the target language determines that it no longer needs it.
>
> That's why you have hooks in the GC in many languages.

This means that it would have to be a weak hash table. I don't see how
storing unique keys which map to real objects through a weak hash table
is simpler than storing pointers to these objects directly. Except that
some static type systems might not like the latter. It's definitely less
efficient - not only because of the indirection and lookup, but because
of weak references with finalizers.

Marcin 'Qrczak' Kowalczyk

unread,
Oct 1, 2005, 4:16:38 PM10/1/05
to
Joachim Durchholz <j...@durchholz.org> writes:

> The best I can do is to compare with projects I have been involved
> with - and it never occurred that there was a *need* to handle errors
> differently.

Perhaps this was accidental: I anticipated more errors to be fatal
when I was designing that, and it turned out that in most cases it's
easy to return some dummy value (or a designated value for an error)
and continue.

I just listed all cases where my code did catch specific exceptions
(i.e. excluding generic constructs which catch exceptions around some
body but do the same thing for any exception). Even if some of them
could be avoided by structuring the program differently, I prefer to
be less constrained and have more options available.

> You need an exception trace, i.e. a list of all functions that the
> exception propagated through when it was caught.
> The exception trace is almost identical to a stack trace most of the
> time, though there can be complications if another exception occurs
> during exception handling, or if code handles an exception and
> throws another one (in the latter case, it's still useful to carry
> information on the original exception around).

In my implementation the stack is physically unwound when an exception
handler finishes, not when it starts. This means that if an exception
handler throws another exception (or rethrows the same exception),
then the stack trace will include the original trace which led to the
previous exception, plus the trace inside the exception handler, plus
anything after that.

The trace is materialized as an object from the scanned stack only when
needed, usually after an exception escapes the toplevel.

> However, this exception trace is just debugging information and can
> be carried in string (text) form. The software isn't supposed to
> base decisions on the contents of the exception trace.

Indeed it's suitable only for ultimately showing it to a human.
I don't want programs to make decisions based on the stack trace.
But I do want them to decide based on the exception itself, which
means it should be something more than a string.

> Or the problem is something that might happen. In that case, throwing
> an exception isn't a good solution - you might be aborting all kinds
> of higher-level transactions you don't know about.

They are responsible for their cleanup in case the code they call
throws an exception. This is usually done by bracketing constructs
which guarantee that something is done no matter whether the given
code finishes normally or fails.
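A rough OCaml sketch of such a bracketing construct:

(* The cleanup runs whether the body returns normally or raises. *)
let protect ~finally body =
  try
    let result = body () in
    finally ();
    result
  with e ->
    finally ();
    raise e

(* Usage: the channel is closed even if process raises. *)
let with_input file process =
  let ic = open_in file in
  protect ~finally:(fun () -> close_in ic) (fun () -> process ic)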

If exceptions are supported at all, no matter whether they are
machine-readable objects or text messages, then code must be prepared
for proper cleanup if they appear. Especially in a dynamically typed
language where various problems are not caught at compile time but
show up at runtime as exceptions.

> Or, the higher-level transactions would have to write exception
> handlers to make sure that their transactions are completed, and
> they'd have to inspect any exception that percolates up and check
> whether it's a failure or something they're supposed to handle.
> Which means they have to know what kinds of exceptions might come
> from the bowels of the lower layers,

No, they only need to know the kinds of exceptions they are supposed
to handle.

Dirk Thierbach

unread,
Oct 1, 2005, 6:39:10 PM10/1/05
to
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> This means that it would have to be a weak hash table.

For example.

> I don't see how storing unique keys which map to real objects
> through a weak hash table is simpler than storing pointers to these
> objects directly.

It isn't. But it's an easy alternative which is not *much* more
difficult, when the language does not support some form of dynamic
types, and you don't want to "reinvent" that for some reason.

It's definitely better than going through other, more complicated
tricks, which on top of that won't allow you to free resources other
than memory.

> Except that some static type systems might not like the latter.

Exactly.

> It's definitely less efficient - not only because of the indirection
> and lookup, but because of weak references with finalizers.

Yes, but efficiency doesn't play a big role here -- you're working in
an interpreted language anyway, *and* you have to check the dynamic
type of each argument that the library routine gets. One extra lookup
doesn't hurt. The weak references shouldn't hurt so much either, and
if you really don't want them, you can always add a routine to
explicitly free the resources.

The main point is still: One can normally simulate the dynamic types
of a specific language with a single not overly complicated algebraic
datatype. Because that's how the data is treated in the dynamically
typed language in the first place.
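For a small embedded language, such a type might look roughly like this
in OCaml (the cases are illustrative):

type value =
  | Int of int
  | Str of string
  | List of value list
  | Func of (value list -> value)
  | Handle of int   (* opaque library data, kept behind a handle *)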

- Dirk


Marcin 'Qrczak' Kowalczyk

unread,
Oct 2, 2005, 4:50:51 AM10/2/05
to
Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

> It's definitely better than going through other, more complicated
> tricks, which on top of that won't allow you to free resources other
> than memory.

Those other tricks bring absolutely no problems with freeing resources.
They were needed only because of the type system; for the GC the
representation of objects was straightforward.

Unlike your solution, which introduces indirection using features
which have special behavior wrt. GC.

> Yes, but efficiency doesn't play a big role here -- you're working in
> an interpreted language anyway, *and* you have to check the dynamic
> type of each argument that the library routine gets. One extra lookup
> doesn't hurt.

I think it does. That it's already quite slow doesn't mean that an
extra dictionary lookup for each object access won't be noticeable.
It's needed only because of the type system. It clearly shows a
deficiency in this type system. My OCaml solution was ugly but at
least it didn't perform unnecessary runtime operations; it was ugly
only wrt. typing.

Yet earlier I made a compiler which generates OCaml source, so it
didn't have interpreting overhead. It has the same issues for object
representation.

> The main point is still: One can normally simulate the dynamic types
> of a specific language with a single not overly complicated algebraic
> datatype. Because that's how the data is treated in the dynamically
> typed language in the first place.

Having a central weak dictionary of all objects is not intrinsic to
dynamic typing. My OCaml solution was much closer to how it would be
done in assembler.

Now I have a compiler instead of an interpreter, and it generates C
code using a very similar technique to that used in OCaml, except that
in C it's not controversial.

Panu Kalliokoski

unread,
Oct 2, 2005, 5:15:21 AM10/2/05
to
In <87slvmg...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:
>> But they are then IMO semantically equivalent to
>> Lisp-with-a-sufficiently-smart-compiler. That is, dynamically typed,
>> period.
>At the same time it's more static than typical dynamically typed
>languages: the compiler will catch lots of type errors at compile
>time in areas where the programming style used doesn't rely on that
>"dynamic" typing, while in a true dynamically typed language a wrong
>type is very rarely caught at compile time.

Okay, I understand this from a practical standpoint... but if we're
talking about properties of languages, we shouldn't confuse them with
the properties of their implementations, and thus describing the
dynamicity of a language in terms of what the compiler typically does (as
opposed to what the compiler could do) seems problematic to me. Maybe
we could separately define the concept of dynamicity of the act of
compiling a program, though...

>>>- Making an operation declared for an open family for cases, which
>>> is then defined incrementally for each case, and dispatched when
>>> applied.
>> Maybe I'm dizzy today but I'm not sure what you're talking about.
>> Maybe a concrete example would help?
>I was implementing an interpreter of a dynamically typed language
>in OCaml, and the problem was in representing runtime values of the
>embedded language in the host language. Requirements:

Let's see if I can follow.

>- Additional modules can wrap new types from the host language without
> having to update a central type definition.

This rules out OCaml's discrete unions.

>- When wrapping a type from the host language, it must be possible to
> dynamically check whether the given wrapped value was constructed
> from the given type, and if so, unwrap to give access to the
> underlying value of the host language.

This rules out any object-style solution, be it based on OCaml objects
or data structures with functions inside.

>- In the embedded language you can attempt to apply a value to
> argumenets. This may succeed not only for "functions", but also for
> various other wrapped types. When wrapping a type, it is decided how
> objects of this type implement application to arguments.

Ah, okay! This last requirement was the one I didn't gather from your
short description. Because the only thing you can do with a polymorphic
variant is to check whether it is a known case, yes, the dispatch of
this becomes O(n) in the number of types. Not "impossible" in any way, but
ugly, yes.

But wait! Could you not express every value as a tuple of (value list ->
value * conretevalue), where the first part is the value's application
method and the second part is the concrete value as a polymorphic
variant? (This technique is much like the idea of expressing values of
lambda calculus as (possibly tag-wrapped) functions from the same type
to the same type.)
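A rough OCaml rendering of the suggestion (a record ties the recursive
knot; the variant cases are illustrative):

type value = {
  apply : value list -> value;              (* the application method *)
  concrete : [ `Int of int | `Str of string ];
}

let int_val n = {
  apply = (fun _ -> failwith "an integer cannot be applied");
  concrete = `Int n;
}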

> Getting a record field is unified with application to a symbol,
> so it's used for all types which want to behave like records.

I don't know what it means in your language to "behave like records",
but once the application dispatch has been done, I don't think this
poses much of extra difficulty.

>Then concrete types are represented by records with the first field of
>type 'a desc for some 'a, using Obj.magic to cast them to kobj and back,
>after checking for the expected typeObj (or redundant idOfType). This is
>again unsafe. Application of an object to arguments jumps straight to
>the application code in the descriptor, without pattern matching.

[...]


>If this was implemented in .NET, I would have two choices:

[...]


>They would be both safe, i.e. they wouldn't rely on constructs which
>have undefined behavior when misused like Obj.magic.

I'm not sure whether I understand what you mean by "safe"; is it safe to
write a (cond) without an else clause in Scheme?

Marcin 'Qrczak' Kowalczyk

unread,
Oct 2, 2005, 6:02:33 AM10/2/05
to
Panu Kalliokoski <ate...@sange.fi> writes:

> But wait! Could you not express every value as a tuple of (value list ->
> value * concretevalue), where the first part is the value's application
> method and the second part is the concrete value as a polymorphic
> variant?

Indeed, except that the last part must be exn or something truly
extensible. Polymorphic variants need to reflect all variants in
the type, so they can't be extended in a modular way.
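A minimal sketch of the exn variant: each module declares its own
exception constructor as an injection/projection pair, with no central
type definition to touch (Image_val is illustrative):

exception Image_val of int

let inj_image h = Image_val h

let prj_image = function
  | Image_val h -> Some h
  | _ -> None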

>>They would be both safe, i.e. they wouldn't rely on constructs which
>>have undefined behavior when misused like Obj.magic.
>
> I'm not sure whether I understand what you mean by "safe"; is it
> safe to write a (cond) without an else clause in Scheme?

It doesn't have undefined behavior. It's defined to do nothing after
checking the tests, only the resulting value is unspecified (and in
portable code had better be ignored).

Generally R5RS indeed has made an unfortunate decision of declaring
all kinds of errors which could easily be checked to have undefined
behavior. They probably did it in order to not commit to some concrete
error catching mechanism.

Most languages don't have so wide undefined behavior. In OCaml there
are only a handful of cases which can cause the processor to start
executing arbitrary code or poking arbitrary memory: Obj.magic,
Obj.obj, unmarshalling, array accessing when bounds checking is
turned off, interfacing with C, and perhaps something else.
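For example, a one-line illustration of why Obj.magic is on that list:

(* Obj.magic : 'a -> 'b is an unchecked cast; the compiler accepts
   this, and any later use of x as a string reads the unboxed
   integer 42 as if it were a pointer - undefined behavior. *)
let x : string = Obj.magic 42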

Joachim Durchholz

unread,
Oct 2, 2005, 10:34:10 AM10/2/05
to
Marcin 'Qrczak' Kowalczyk schrieb:

> Even if some of them
> could be avoided by structuring the program differently, I prefer to
> be less constrained and have more options available.

I'm not sure when the "have more options available" is valid and when it
isn't.
For an extreme example, "goto" gives you more design options, too, but
it's generally deprecated because it's too easy to use it in an
unstructured manner, and though it can be somewhat difficult to
eliminate goto from a program, and sometimes might even complicate
things, people generally agree that not having goto in programming
languages is a Good Idea.
I do suspect that associating a real data flow with exceptions is
similar to goto, though maybe with less serious consequences, and these
consequences definitely set in at much larger program sizes.

>>However, this exception trace is just debugging information and can
>>be carried in string (text) form. The software isn't supposed to
>>base decisions on the contents of the exception trace.
>
> Indeed it's suitable only for ultimately showing it to a human.
> I don't want programs to make decisions basing on the stack trace.
> But I do want - basing on the exception itself, which means it should
> be something more than a string.

OK - then that's where we disagree.

>>Or the problem is something that might happen. In that case, throwing
>>an exception isn't a good solution - you might be aborting all kinds
>>of higher-level transactions you don't know about.
>
> They are responsible for their cleanup in case the code they call
> throws an exception. This is usually done by bracketing constructs
> which guarantee that something is done no matter whether the given
> code finishes normally or fails.

Yup, the "try...catch...final...end" construct.

I always found it's a kind of mental exercise to figure out what should
be done in the "final" part and what should be done after the "end".

Though that's probably a problem with any exception mechanism, whether
it's associating data with an exception or not.

(The issue can be mostly avoided if the language has ACID properties.
I.e. if an exception is thrown, all state-changing activities are undone.
Of course that's not possible if interaction with the physical world is
involved, such as an ATM machine handing out money *g* - so that avenue
would reduce the problem by orders of magnitude, but it doesn't
eliminate it fully.)

> If exceptions are supported at all, no matter whether they are
> machine-readable objects or text messages, then code must be prepared
> for proper cleanup if they appear. Especially in a dynamically typed
> language where various problems are not caught at compile time but
> show up at runtime as exceptions.

Agreed.

>>Or, the higher-level transactions would have to write exception
>>handlers to make sure that their transactions are completed, and
>>they'd have to inspect any exception that percolates up and check
>>whether it's a failure or something they're supposed to handle.
>>Which means they have to know what kinds of exceptions might come
>>from the bowels of the lower layers,
>
> No, they only need to know the kinds of exceptions they are supposed
> to handle.

Kinds of exceptions - sure.
But they'll get exception data from low-level layers. That data is at a
low abstraction level, and often not really interpretable unless enough
intermediate context is added. IOW intermediate-level code has to rewrap
the exception data.
Not sure that this is a real improvement over passing the data up
explicitly... the effort needed is roughly the same, but programmers
tend to treat data wrapping in exception handlers as an afterthought, which
means you'll get mostly useless data if the exception goes up more than
one abstraction level.

Whatever :-)

Regards,
Jo

Dirk Thierbach

unread,
Oct 2, 2005, 10:28:20 AM10/2/05
to
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

> Those other tricks bring absolutely no problems with freeing
> resources.

No, but they are ugly, and have problems with the static type system.
Hence, what I am trying to say is that there are alternatives that
have no problems with the static type system. That these alternatives
may also have disadvantages in some special situations (here: a
complicated opaque type, which shouldn't get converted, and just needs
memory as a resource) is not important in this respect.

>> Yes, but efficiency doesn't play a big role here -- you're working in
>> an interpreted language anyway, *and* you have to check the dynamic
>> type of each argument that the library routine gets. One extra lookup
>> doesn't hurt.

> I think it does. That it's already quite slow doesn't mean that an
> extra dictionary lookup for each object access won't be noticeable.

What percentage of the total time needed would you estimate?

> It's needed only because of the type system. It clearly shows a
> deficiency in this type system.

It's not a "deficiency". In static type systems, you can throw away
the type information at runtime. In dynamic type systems, you have
to keep this information, but then you cannot make any guarantees at
compile time. Take your pick. You cannot have your cake and eat
it too.

One can convert between both representations by adding runtime type
information based on the known static type in one direction, and by
inspecting dynamic types in a typecase-style statement in the other
direction (which is more or less what happens in Haskell's Dynamic
module), but that's about all one can do.

> Yet earlier I made a compiler which generates OCaml source, so it
> didn't have interpreting overhead.

But it still has to dispatch on the dynamic type.

>> The main point is still: One can normally simulate the dynamic types
>> of a specific language with a single not overly complicated algebraic
>> datatype. Because that's how the data is treated in the dynamically
>> typed language in the first place.

> Having a central weak dictionary of all object is not intrinsic to
> dynamic typing.

No. But what has that to do with my remark above?

And you don't use a *central* weak dictionary of *all* objects, you
use a dictionary in the *modules* when they need this special kind of
object. That's different.

And people *do* use this technique in the real world. The imaging
libraries for Tcl, themselves written in C, do it, for example. With
no dramatic loss of efficiency. Files are natively implemented in Tcl
in the same way.

> Now I have a compiler instead of an interpreter, and it generates C
> code using a very similar technique to that used in OCaml, except that
> in C it's not controversial.

C is only glorified assembler, anyway, so it's much more suitable
as a target language. And it has no built-in GC, so it's also not
important how things behave with respect to GC here.

- Dirk

Marcin 'Qrczak' Kowalczyk

unread,
Oct 2, 2005, 2:20:38 PM10/2/05
to
Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

>> Those other tricks bring absolutely no problems with freeing
>> resources.
>
> No, but they are ugly, and have problems with the static type system.

That's the fault of the type system. I know that when the type id says
that it's a socket, then the corresponding data will refer to a socket
object, but the type system doesn't allow me to express that. It
doesn't let me ask whether it's a socket either. I must jump through
hoops to hide the fact that this dynamic object refers to an object
of a varying type, just because the type is varying.

> What percentage of the total time needed would you estimate?

Hard to say, perhaps 20%. Also higher memory consumption: hash table
entries and weak references take extra space. Weak references include
some internal pointers to let the GC find them, and extra time is
spent during GC so it marks them as live or dead.

> It's not a "deficiency". In static type systems, you can throw away
> the type information at runtime. In dynamic type systems, you have
> to keep this information, but then you cannot make any guarantees at
> compile time. Take your pick. You cannot have your cake and eat
> it too.

What guarantees? In many OO-style type systems (Java, .NET) I have a
guarantee that the program will not poke at random addresses in memory
nor attempt to execute a random part of memory as code. In OCaml I
don't have this guarantee, and it's mostly because of lack of RTTI
(Obj.magic, marshalling, dynamic code loading).

When I make an error injecting an object into a dynamic type and
tag it with a wrong type, in an OO-style type system I might get an
unexpected exception. In your encoding I might get an index which
corresponds to an unrelated object, which has worse consequences.

Just because an error will not be called a type error, or a bad index
will not be called a dangling pointer, doesn't make it safe and robust.

> And you don't use a *central* weak dictionary of *all* objects, you
> use a dictionary in the *modules* when they need these special kind
> of objects. That's different.

Ok, one dictionary per type. Not a big difference because there are as
many dictionaries as types, and the same total number of entries - one
per object.

Depending on how unique keys are created, this might actually double
or triple the number of live objects from the point of view of the
host language.

> And people *do* use this technique in the real world. The imaging
> libraries for Tcl, themselves written in C, do it, for example. With
> no dramatic loss of efficiency. Files are natively implemented in
> Tcl in the same way.

As far as I know Tcl, it has a brain-dead semantics which lacks
pointers, because it insists on everything being strings conceptually,
and must indeed emulate them by magic indices into central tables.
Tcl is a very bad example to follow.

Lisp, Scheme, Python, Perl, Ruby and Smalltalk don't have to represent
active objects with an indirection through a central table, and pointers
are an intrinsic part of their semantics. They are good examples.

Joe Hendrix

unread,
Oct 2, 2005, 3:12:34 PM10/2/05
to
Jon Harrop wrote:
>> There are also higher order type systems that show up in proof
>> assistents like Coq. I don't know of any major languages that use them.
>
> Any ideas why they aren't used in ordinary languages? I have found that I
> need to be quite familiar with the type system in order to make good use of
> ML. Perhaps their type systems are just too complicated for Jo-programmer.

I think the biggest problem is decidability. In a theorem proving
environment, one can use the prover to show type correctness. In an
ordinary language, if the type system is not decidable, there needs to be
a simple strategy that almost always works and performs well.

>> I don't personally have significant enough experience to compare the
>> type system in C++ with those of Haskell or ML.
>
> IMHO, practical issues make C++'s type system unusable, as you said. In
addition to the hideous syntax is the appallingly slow compile time. When I
> last used C++ I was seeing compile times of 1-24hrs for <10kLOC projects.

I haven't experienced compile times that slow, though it certainly seems
possible. A frustrating aspect of C++ code that uses templates is that
one must consider the compile time performance in addition to the usual
concerns on code correctness, maintainability, and runtime performance.
I have quite a bit of respect for what the STL designers are trying to
accomplish through templates -- much more interesting than what I've
seen in object oriented areas.

Dirk Thierbach

unread,
Oct 2, 2005, 3:54:05 PM10/2/05
to
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

>>> Those other tricks bring absolutely no problems with freeing
>>> resources.

>> No, but they are ugly, and have problems with the static type system.

> That's the fault of the type system. I know that when the type id says
> that it's a socket, then the corresponding data will refer to a socket
> object, but the type system doesn't allow me to express that.

Well, you're asking for a dynamic type in a static type system. Doesn't
work. Either do it differently, or use a dynamic type system. None
of this is surprising, and none is the "fault" of the static type
system.

> It doesn't let me ask whether it's a socket either. I must jump
> through hoops to hide the fact that this dynamic object refers to
> this object of a varying type, just because the type is varying.

Then don't jump through hoops, use a clean solution.

>> What percentage of the total time needed would you estimate?

> Hard to say, perhaps 20%.

I really doubt that. That would mean that the operation you're going
to do just takes four times as long as a hash-table lookup. Not
realistic for a data structure complicated enough that you wrap it
instead of converting it.

I'd rather estimate something like less than 1/1000.

As I said, it's done in practice. Nobody complains about the efficiency.

>> It's not a "deficiency". In static type systems, you can throw away
>> the type information at runtime. In dynamic type systems, you have
>> to keep this information, but then you cannot make any guarantees at
> compile time. Take your pick. You cannot have your cake and eat
> it too.

> What guarantees? In many OO-style type systems (Java, .NET) I have a
> guarantee that the program will not poke at random addresses in memory
> nor attempt to execute a random part of memory as code.

You have this guarantee, unless of course you use parts of the system
that are explicitly declared as "unsafe, only use if you know what
you're doing".

> In OCaml I don't have this guarantee, and it's mostly because of
> lack of RTTI (Obj.magic, marshalling, dynamic code loading).

If you're complaining that OCaml out of the box doesn't have
support for dynamic types, then that's certainly correct. OTOH,
you don't have to use dynamic types in most cases (unless you're
stubborn and want to do it that way because you want to do it that
way). That's the point I am trying to make for some time now :-)

And if you really want dynamic types, there's Dynamic in Haskell,
and there's also a third party dynamic package for OCaml (which I
didn't have a look at yet, because I just don't need these things
normally).

> When I make an error when injecting an object to a dynamic type and
> tag it with a wrong type, in an OO-style type system I might get an
> unexpected exception. In your encoding I might get an index which
> corresponds to an unrelated object, which has worse consequences.

No. Of course you tag the opaque object with a unique type id, just
like an implementation of a dynamically typed language would do. And all
of this is handled inside the library modules. The only thing you have
to pay attention to is that the modules get unique ids, which shouldn't
be too difficult.

> Just because an error will not be called a type error,

But it will be called a dynamic type error. Raised in the library,
right where it belongs.

>> And you don't use a *central* weak dictionary of *all* objects, you
>> use a dictionary in the *modules* when they need these special kind
>> of objects. That's different.

> Ok, one dictionary per type.

No. One dictionary (or even just array) per opaque type that for some
reason you don't want to convert into the internal representation of
the interpreted language. There are some examples where this is
useful (like for large images, to save memory), but not too many
I can think out of my head. Many libraries (say, a complex number
library) won't have to use that at all.

> Not a big difference because there are as many dictionaries as
> types, and the same total number of entries - one per object.

No. See above.

> Depending on how unique keys are created, this might actually double
> or triple the number of live objects from the point of view of the
> host language.

No. And, if the objects are really heavy-weight, you can always
ask for explicit deallocation, as I have already said.

> Lisp, Scheme, Python, Perl, Ruby and Smalltalk don't have to represent
> active objects with an indirection through a central table, and pointers
> are an intrinsic part of their semantics. They are good examples.

I vaguely remember that there is a Python library which also uses this
scheme, but I cannot come up with the details at the moment. So let's
look at other examples. Many OSes represent files as handles (just
integers).
X represents windows by their id. Windows has handles all over the
place. All of them use tables for this kind of indirection, and quite
happily.

- Dirk

Panu Kalliokoski

unread,
Oct 3, 2005, 1:52:49 AM10/3/05
to
In <871x34u...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:
>> But wait! Could you not express every value as a tuple of (value list ->
>> value * concretevalue), where the first part is the value's application
>> method and the second part is the concrete value as a polymorphic
>> variant?
>Indeed, except that the last part must be exn or something truly
>extensible. Polymorphic variants need to reflect all variants in
>the type, so they can't be extended in a modular way.

Are there actually places where you have to write the type open? Of
course, the types of the functions that handle these values have to have
a "can-have-at-most" clause, but that can be made open by extending
every pattern match with a | _ -> raise RuntimeTypeError ...
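Roughly, in OCaml (RuntimeTypeError and the cases are illustrative):

exception RuntimeTypeError

(* The catch-all keeps the variant open; the inferred type is
   [> `Num of int | `Str of string ] -> string. *)
let to_string = function
  | `Str s -> s
  | `Num n -> string_of_int n
  | _ -> raise RuntimeTypeError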

>Most languages don't have so wide undefined behavior. In OCaml there
>are only a handful of cases which can cause the processor to start
>executing arbitrary code or poking arbitrary memory: Obj.magic,
>Obj.obj, unmarshalling, array accessing when bounds checking is
>turned off, interfacing with C, and perhaps something else.

Actually I was not aware of Obj.magic, and I didn't realize it was
part of the standard library until now. Yet, I'm still
not sure about the distinction between "truly undefined behavior" and
"unspecified return value"; the former might be harder to debug in the
general case, but neither of them really prevents the programmer from
causing much harm by making a mistake and violating the use conditions of
these features (such as casting an object to a type which it really is
not, or using the return value to determine whether a given file should
be deleted or not).

Panu Kalliokoski

unread,
Oct 3, 2005, 6:24:55 AM10/3/05
to
In <87zmpta...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:
>> I think a program is statically typed when, and only when, it does
>> not have to check the type of any expression at run time.
>It's not that black & white. First, because there is no
>language-independent definition of a type.

Then let's try to devise one. Even though we can't get a totally
language-independent notion, we can probably make one that applies to a
lot of (programming) languages.

First off, types are generally a property of /values/ (in languages
where types are a property of variables, they are transitively a
property of values, namely the property of those values that can be held
in a variable of some type). "Value" is not a language-independent
notion, but I think we can make do with the usual definition of
first-class citizenship.

I think that for the purposes of this discussion, it is not sufficient
to treat as types only those properties of values that a particular
language happens to call types. We should probably deem as a type
_any property that will make the values (that have it) suitable for
handling in some specific way_. "Suitable" here means that it is not an
error and will not cause unspecified behaviour to do so. This is vague,
but will probably make sense for many languages. (This definition will
probably make any set of values with a constructive definition a
"type".)

By /type checking/ we probably mean the act of constraining a value to
have a particular type. When done "statically", it usually means
rejecting the program if some of its expressions can be proven to
break this constraint. When done "dynamically", it usually means that
the program generates an error when a value is noted to lack a type it
should have (to carry out some operation on the value).

Under these definitions, there actually are programs that are statically
typed, even though the definition of type is broad enough to allow types
like (and (rational? n) (> n 3) (< n 78)). However, there clearly is
no language where every program is statically typed, so I retract the
second part of my original definition.

The ability to do dynamic type checking requires that values (of some
more generic type) contain enough information to deduce whether they
break the type constraint. This extra information was the reason for
the original intuition behind the similarity of RTTI and dynamic typing.
But because obviously we can find out about an integer, in C, whether it
is a natural number, we should probably seek a finer-grained
definition of dynamic typing.

My next suggestion is that a language is considered dynamically typed
iff from every value in it (whose type is known to be some arbitrarily
generic type) one can deduce whether it belongs to an arbitrarily
specific subtype. (Maybe this property of language should be called
"type safety" instead.)

But what, then, is a statically typed language? How about it being a
language where a value of some specific type cannot be treated as a
value of a more generic type -- possibly because the language does not
have a concept for the more generic type? Under these definitions, Java
would be a language both static (it does not have a common runtime type
for all values) and dynamic (for every static type, the subtype can be
found out to an arbitrary degree).

Because every (semantically closed) subset of a language is also a
language, non-dynamic languages may have dynamic sublanguages. Maybe
the "size" of the dynamic sublanguage can serve as an intuitive
definition of the dynamicity of a language.

Now, let's try your examples in light of these defn's...

>For example in Perl if you
>consider only scalars, arrays, hashes, subroutines and IO handles,
>then it's statically typed. If you distinguish references blessed for
>a particular package as distinct types of scalars, then it's not.

I don't know Perl very well, but I think it's not statically typed
anyway, because there is a supertype for all types (the wildcard). But
maybe Perl gurus will educate me.

>Is checking the exn constructor in OCaml dynamic typing? I know OCaml
>doesn't call them types, but they correspond to types in other languages.

Probably not, because exceptions are not _values_, and so the types of
exceptions are not part of the type system of the language (if my
definition above is used).

>We could declare any feature to be statically typed merely by calling
>the thing it checks something other than the type. Any language becomes
>statically typed if we distinguish only one type for everything.

Interesting that you arrive at this definition, because that's my
criterion for a language being _not_ statically typed.

>Second, there is much between checking no types and checking all
>types. Is OCaml statically typed? Its implementation of = checks types

Under my definitions, yes, because OCaml statically constrains the
types in programs and does not provide a way to coerce a value into a
common supertype.

>Is C statically typed? It doesn't give the programmer tools for
>checking the types, it only follows pointers, calls functions etc.

Under my definitions, yes, C is statically typed, and moreover, it's not
dynamically typed (or not "safe").

>Is working with void *
>and casting it to the expected type statically typed programming? The
>compiler doesn't have to check the type of any expression at runtime.

(Rephrasing as "is a program that casts void * to a more specific type
static?") Under my original definition of "statically typed program",
yes. (But I'm causing confusion by disconnecting the definitions of
"statically typed program" and "statically typed language".)

>Is C++ less statically typed than C? This would be counter-intuitive.

*I* don't have definitions for "more statically typed" and "less
statically typed".

>Yes, it provides some RTTI. But its type system is more expressive
>(templates, subtyping), so there are fewer cases when you have to
>step outside it and work with void *. It's not fair to call C more
>statically typed than C++.

It probably is fair, but then again, it is also fair to call C less safe
than C++.

>To be constructive, I think the distinction between mostly static
>typing and mostly dynamic typing is almost clear after we decide
>what to call the type, which is somewhat arbitrary. A language is
>statically typed if it routinely and predictably rejects programs
>which can't be proved to follow its typing rules. It's dynamically
>typed if it accepts unsure programs by default, and at most tries
>to warn when types look deeply suspicious.

To be pedantic, a language does not "reject" programs, it contains them
or does not do so. But this definition was the one I had most trouble
with initially, because in Lisp I *can* reject programs at compile time
by generating compile errors programmatically. You don't have to
connect static type checking to a language; you can write it as an
add-on. So I find little benefit in defining the typing of a language
in terms of the behaviour of its implementation.

>Perl feels dynamically typed when using it, and Java feels statically
>typed.

Probably because in Perl, you don't have to write type declarations, and
in Java, you must. But Java starts to feel quite dynamic, too, if you
declare all your variables as "Object" and use instanceof a lot. Should
it be deemed a language feature that Java programmers seldom do this?

>> Well, then, Lisp without use of type predicates is statically typed.
>It's not, because compilers don't reject programs merely because they
>can't prove them to use types consistently.

Would you consider a compiler that refuses to compile (* "foo" 3/5) to
not implement Lisp?

>> Besides, macros can be used to code-walk all invocations of
>> functions to check the type of the arguments at compile-time.
>I don't believe it's practical to do that. The original CL type system
>is not expressive enough, macros are given too little tools for
>manipulation of types (I think they can't even expand type aliases),

I just meant that you can build your own type system on top of CL.

>even have to have sensible static types. What is the type of COERCE?

Obviously, when you implement some specific type system, you're free to
reject programs that don't play by the rules.

Panu Kalliokoski

unread,
Oct 3, 2005, 6:27:06 AM10/3/05
to
In <dhlmlu$89o$1...@online.de> Joachim Durchholz <j...@durchholz.org> writes:

>Panu Kalliokoski schrieb:
>> I think a program is statically typed when, and only when,
>> it does not have to check the type of any expression at run time.
>It's a continuum.
>Some part of a program may be statically typed, and other parts may be
>dynamically typed.

Agreed. Every self-sufficient part of a program is a program.


Marcin 'Qrczak' Kowalczyk

unread,
Oct 3, 2005, 8:12:32 AM10/3/05
to
Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

> Well, you're asking for a dynamic type in a static type system.
> Doesn't work. Either do it differently, or use a dynamic type
> system. None of this is surprising, and none is the "fault" of
> the static type system.

If it had RTTI, which needs few changes in the type system but
big changes in object representation, then it would support seamless
integration of static and dynamic typing.

> No. One dictionary (or even just array) per opaque type that
> for some reason you don't want to convert into the internal
> representation of the interpreted language.

So you are suggesting a two-stage dynamic dispatch, like I had two
versions of my translator before the current one: first a finite
number of cases of the "internal" representation, then the last case
for types which are treated as "foreign".

I changed it because I didn't like using two entirely different
mechanisms for the same thing (dynamic dispatch on types). Now there
is no distinction between internal and foreign representations.
There is just an open set of dynamic types, both on the conceptual
level and in the implementation.

(Actually my current implementation unboxes a subset of integers.
It's only an implementation detail though.)

> I remember vaguely that there is a Python library which also uses
> this scheme, but cannot come with details at the moment. So let's
> look at other examples. Many OS represent files as handles (just
> integers). X represents windows by their id. Windows has handles all
> over the place. All of them use tables for this kind of indirection,
> and quite happily.

When passing data between processes or between a process and the OS,
you can't use pointers, so they have to be represented using indices
into central dictionaries. It's unavoidable. And it causes problems
with memory management (distributed GC is hard even conceptually).

It should not be needed for referring to objects within the same
process, when there is one address space and there are no security
issues with using pointers.

Marcin 'Qrczak' Kowalczyk

unread,
Oct 3, 2005, 8:50:49 AM10/3/05
to
Panu Kalliokoski <ate...@sange.fi> writes:

>>Indeed, except that the last part must be exn or something truly
>>extensible. Polymorphic variants need to reflect all variants in
>>the type, so they can't be extended in a modular way.
>
> Are there actually places where you have to write the type open?
> Of course, the types of the functions that handle these values have
> to have a "can-have-at-most" clause, but that can be made open by
> extending every pattern match with a | _ -> raise RuntimeTypeError
> ...

Adding this pattern at the end doesn't allow the function to be extended
from outside. An open set of cases (or subtypes) means two things:
- The ability to add new cases.
- The ability to extend old dispatching functions to cover the added
cases.

Perhaps I'm biased because only certain dynamically typed languages
provide open dispatch, especially one which is not done by putting
functions inside objects or classes. Others don't find it valuable,
or even don't allow defining new types at all, and have e.g. the
interpretation of equality and of collection operations hardwired
for a fixed set of builtin types.

> Yet, I'm still not sure about the distinction between "truly
> undefined behavior" and "unspecified return value";

Undefined behavior is always a bug in the program. A correct program
will not invoke it under any circumstances.

Unspecified return value in Scheme is safe as long as the value is
ignored, and it's unavoidable: many essential constructs return it.

Marcin 'Qrczak' Kowalczyk

unread,
Oct 3, 2005, 10:07:51 AM10/3/05
to
Panu Kalliokoski <ate...@sange.fi> writes:

>>It's not that black & white. First, because there is no
>>language-independent definition of a type.
>
> Then let's try to devise one.

It will involve an arbitrary decision per language. It's impossible
to invent a universal unambiguous formulation which would agree with
the intuition, especially since sometimes the intuition doesn't tell
anything.

> "Value" is not a language-independent notion, but I think we can
> make do with the usual definition of first-class citizenship.

"Value" is much more clear across languages than "type". I think we
can agree that each language comes with a reasonable definition of
"value", which is usually what expressions evaluate to (when they
don't fail, and unless the expression evaluates to several values)
if the language has expressions at all (Prolog and Forth also have
values but they are not values of expressions).

> I think that for the purposes of this discussion, it is not sufficient
> to treat as types only those properties of values that a particular
> language happens to call types.

Sure. Usually it agrees though.

In Python it doesn't agree with something that I think most people
would consider types in the general sense. I'm even not sure what is
the actual Python definition; it has been changing recently:

Python 2.4.1 (#1, Apr 30 2005, 17:02:19)
[GCC 4.0.0 (PLD Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> type(3)
<type 'int'>
>>> type(type(3))
<type 'type'>
>>> class C1: pass
...
>>> type(C1())
<type 'instance'>
>>> class C2: pass
...
>>> type(C1()) == type(C2())
True
>>> class C3(object): pass
...
>>> type(C3())
<class '__main__.C3'>
>>> type(C3)
<type 'type'>
>>> type(C1)
<type 'classobj'>

It seems that instances of all so called classic classes have the
same type, but for new-style classes (which inherit from at least
one new-style class) the type is the same as the class. Moreover,
new-style classes are types themselves, but classic classes aren't.
So Python's definition of "type" is quite irregular.

Probably a generic definition, when applied to Python, should treat
different classic classes as distinct types.

> We should probably deem as a type _any property that will make the
> values (that have it) suitable for handling in some specific way_.

This means any property at all, it's too broad.

> My next suggestion is that a language is considered dynamically typed
> iff from every value in it (whose type is known to be some arbitrarily
> generic type) one can deduce whether it belongs to an arbitrarily
> specific subtype.

I would prefer to call them "supporting dynamic typing" or something,
because it doesn't rule out static type checking too, while "dynamically
typed" generally implies lack of systematic static typechecking.

> But what, then, is a statically typed language? How about it being a
> language where a value of a some specific type cannot be treated as a
> value of a more generic type -- possibly because the language does not
> have a concept for the more generic type?

Then Java would be statically typed (because primitive types like int
don't have subtypes) but C# would not (because object is a supertype
of all types, including int).

And by the way, do you mean that *no* values can be treated as of a
supertype? Then OCaml is not statically typed because it has subtyping
among object types and polymorphic variants. Or do you mean that there
*exist* values which cannot be treated as of a supertype? Then Ruby
is statically typed, because Object.new is an instance of Object which
has no supertypes.

> I don't know Perl very well, but I think it's not statically typed
> anyway, because there is a supertype for all types (the wildcard).

I think it can be thought of as a Lisp symbol: it refers to the
internal object which has slots for a few global things sharing the
same name. It's not a supertype: an array can be referred to from
a wildcard only if it's named with a global name.

>>Is checking the exn constructor in OCaml dynamic typing? I know OCaml
>>doesn't call them types, but they correspond to types in other languages.
>
> Probably not, because exceptions are not _values_, and so the types
> of exceptions are not part of the type system of the language (if my
> definition above is used).

I don't understand. OCaml exceptions are definitely values.

When you want to introduce a new kind of exception, in OCaml you make
a new constructor of the exn type (at least this is how OCaml calls it),
but in many other languages you make a new type. I'm not convinced that
differences between these two mechanisms are significant enough that we
should consider the former as the same type and the latter as separate
types.

I'm not convinced that there exists a sensible cross-language
definition of a type.

>>To be constructive, I think the distinction between mostly static
>>typing and mostly dynamic typing is almost clear after we decide
>>what to call the type, which is somewhat arbitrary. A language is
>>statically typed if it routinely and predictably rejects programs
>>which can't be proved to follow its typing rules. It's dynamically
>>typed if it accepts unsure programs by default, and at most tries
>>to warn when types look deeply suspicious.
>
> To be pedantic, a language does not "reject" programs, it contains
> them or does not do so. But this definition was the one I had most
> trouble with initially, because in Lisp I *can* reject programs at
> compile time by generating compile errors programmatically.

That's why I said "routinely and predictably". In Lisp the degree of
detection varies with the sophistication of the implementation, it's
not a part of the language definition. The "type checker" can be
fooled into accepting the program by rewriting the program in trivial
ways. It's very different from truly static type checkers which reject
when they are not sure.

Lisp compilers aren't allowed to reject these programs which would
try to add two strings. They only warn, because the language itself
permits using these constructs as long as they are not actually
executed:

$ sbcl
This is SBCL 0.9.0, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* (defun f (x) (if x (+ x "a")))
; in: LAMBDA NIL
; (+ X "a")
;
; note: deleting unreachable code
;
; caught WARNING:
; Asserted type NUMBER conflicts with derived type
; (VALUES (SIMPLE-ARRAY CHARACTER (1)) &OPTIONAL).
; See also:
; The SBCL Manual, Node "Handling of Types"
;
; compilation unit finished
; caught 1 WARNING condition
; printed 1 note

F
* (f nil)

NIL

In my language Kogut I was careful to distinguish errors which are
supposed to be always caught (e.g. referring to a non-existent
variable) and warnings which just happened to be sometimes found by
the implementation statically (e.g. applying a function to a wrong
number of arguments). These warnings do cause the compiler to exit
with a non-zero status, because they are almost surely a bug, but
after producing an object file or executable, because strictly
speaking the program is statically valid.

>>Perl feels dynamically typed when using it, and Java feels
>>statically typed.
>
> Probably because in Perl, you don't have to write type declarations,
> and in Java, you must.

Some would say that Perl distinguishes $ @ % & prefixes in variables;
it's equivalent to type declarations.

It doesn't statically distinguish references to different types
from each other though (nor from strings and numbers). I mean both
distinguishing references to scalars, to arrays etc., and also
distinguishing references blessed to different packages. That's why
it should be considered dynamically typed.

The set of builtin types is so small that you generally encode what
would be different types in other languages by different values, e.g.
having dictionaries with a distinguished key which serves for checking
the nature of the entity it represents. It might become similar to
emulating a dynamic type system within a fixed set of static types.
It's all pretty foggy.

> Would you consider a compiler that refuses to compile (* "foo" 3/5)
> to not implement Lisp?

Yes, especially if it's hidden in unreachable code.

Dirk Thierbach

unread,
Oct 3, 2005, 10:30:12 AM10/3/05
to
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

> If it had RTTI, which needs little change in the type system but
> big changes in object representation,

And probably big changes in the GC -- I don't know any details how
OCaml handles GC, but AFAIK one argument against using Obj.magic is
that the GC doesn't work properly for the converted value, if it
is sufficiently different (someone who knows more about this
please correct me if I am wrong).

So your wish to use dynamic types together with automatic GC just
runs into trouble at this point. But that's not necessarily a
"deficit" of the static type system.

> then it would support seamless integration of static and dynamic
> typing.

As I said, there's a third-party package in development. Maybe it's
already good enough.

>> No. One dictionary (or even just array) per opaque type that
>> for some reason you don't want to convert into the internal
>> representation of the interpreted language.

> So you are suggesting a two-stage dynamic dispatch,

No. It all happens at the same stage. If this is really so difficult,
let's make an example, starting from yours (I didn't check against syntax
errors and other stupid mistakes):

type ktype = int (* Unique tag *)

type obj =
| KFunction of (trace -> obj list -> obj list)
| KInt of int
| KBigInt of Big_int.big_int
... etc. for various primitive types ...
| KList of obj list (* Lisp style lists *)
...
| KOpaque of ktype * obj
| KHandle of ktype * int

Now a library for complex numbers can do something like

module Complex

let kcomplex : ktype = ...

let add x y = match (x, y) with
  | (KOpaque (xt, KList [KInt xreal; KInt xim]),
     KOpaque (yt, KList [KInt yreal; KInt yim]))
    when xt = kcomplex && yt = kcomplex ->
      KOpaque (kcomplex, KList [KInt (xreal + yreal); KInt (xim + yim)])
  | _ -> failwith "Type error"

An imaging library does something like

module Image

let lookup handle = ...

let kimage : ktype = ...

let fill_image img color = match (img, color) with
  | (KHandle (t, h), KList [KInt r; KInt g; KInt b]) when t = kimage ->
      let img = lookup h in ... fill image with r/g/b ...
  | _ -> failwith "Type error"

Nice, clean, simple, and extensible; and the only "second stage" is
the lookup, which takes next to no time for an array or hash.

> When passing data between processes or between a process and the OS,
> you can't use pointers, so they have to be represented using indices
> into central dictionaries. It's unavoidable. And it causes problems
> with memory management (distributed GC is hard even conceptually).

Yes. But that's not the point: The point is that efficiency cannot
be so bad, if this technique is used everywhere. If you want it to
work together with the native GC, you need the proper hooks. But that's
an issue of GC implementation, not of static typing.

- Dirk

Marcin 'Qrczak' Kowalczyk

unread,
Oct 3, 2005, 12:07:24 PM10/3/05
to
Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

>> If it had RTTI, which needs little change in the type system but
>> big changes in object representation,
>
> And probably big changes in the GC

Well, the implementation of the GC depends on object representation,
but it's not intrinsically harder because of RTTI.

> AFAIK one argument against using Obj.magic is that the GC doesn't
> work properly for the converted value, if it is sufficiently
> different

For example OCaml generates a different code for updating a reference
or array element, depending on whether it's sure that the value is an
immediate value or whether it could possibly be a pointer. When the
reference and the value are Obj.magic'd into a type which consists
only of non-pointers while in reality they are pointers, wrong code
would be generated for assignment.

It's not a problem with RTTI itself but with fooling the compiler that
something has a different type than it really does, together with the
desire to perform optimizations basing on type information. If the
compiler honestly saw that the value can have any type, it would just
use the more general implementation.
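
[To make the hazard concrete, a deliberately unsound sketch, shown
only to illustrate the point and not to be used:

let r : string ref = ref "old"
let fake : int ref = Obj.magic r     (* lie: claim it holds ints *)
let () = fake := Obj.magic "young"
(* The compiler sees "int ref := int" and emits a plain store without
   the GC write barrier, yet the stored value is really a heap
   pointer, so the GC may never hear about it. *)

This is the wrong code for assignment described above.]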

> | (KOpaque (xt, KList [KInt xreal; KInt xim]),
> KOpaque (yt, KList [KInt yreal; KInt yim]))

Ok, it works for values which can be reasonably efficiently built
using the algebra of existing constructors. Doesn't work for types
which should be implemented at a lower level for efficiency, unless
"all" of them are prepared in advance: floats, strings, bit arrays,
byte arrays etc. Requires even primitive types to accept the overhead
of dynamic typechecking of their contents.

Objects which don't merely hold data but are opaque to the host
language must be represented indirectly: thread handles, mutexes,
sockets, weak references, wrapped objects from other languages,
zlib compressor state, Mersenne Twister's random state, iconv
conversion descriptors, database connection handles etc.

Unless we put everything in one giant type definition of course.
This would prevent making external modules which add more types.

> Yes. But that's not the point: The point is that efficiency cannot
> be so bad, if this technique is used everywhere.

It's not "everywhere", only when communicating with the world external
to the process where it's unavoidable. People expect this to be less
efficient than local computation, libraries use buffering and caching
to reduce the amount of physical communication.

Panu Kalliokoski

unread,
Oct 3, 2005, 2:05:06 PM10/3/05
to
In <87zmpta...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:
>> I think a program is statically typed when, and only when, it does
>> not have to check the type of any expression at run time.
>It's not that black & white. First, because there is no
>language-independent definition of a type.

Then let's try to devise one. Even though we can't get a totally
language-independent notion, we can probably make one that applies to a
lot of (programming) languages.

First off, types are generally a property of /values/ (in languages
where types are a property of variables, they are transitively a
property of values, namely the property of those values that can be
held in a variable of some type). "Value" is not a language-independent
notion, but I think we can make do with the usual definition of
first-class citizenship.

I think that for the purposes of this discussion, it is not sufficient
to treat as types only those properties of values that a particular
language happens to call types. We should probably deem as a type
_any property that will make the values (that have it) suitable for
handling in some specific way_. "Suitable" here means that handling
them so is not an error and will not cause unspecified behaviour.
This is vague, but will probably make sense for many languages.
(This definition will probably make any set of values with a
constructive definition a "type".)

By /type checking/ we probably mean the act of constraining a value to
have a particular type. When done "statically", it usually means
rejecting the program if some of its expressions can be proven to
break this constraint. When done "dynamically", it usually means that
the program generates an error when a value is noted to lack a type it
should have (to carry out some operation on the value).

Under these definitions, there actually are programs that are statically
typed, even though the definition of type is broad enough to allow types
like (and (rational? n) (> n 3) (< n 78)). However, there clearly is
no language where every program is statically typed, so I retract the
second part of my original definition.

The ability to do dynamic type checking requires that values (of some
more generic type) contain enough information to deduce whether they
break the type constraint. This extra information was the reason for
the original intuition behind the similarity of RTTI and dynamic typing.
But because obviously we can find out about an integer, in C, whether it
is a natural number, we should probably seek for finer-grained
definition of dynamic typing.

My next suggestion is that a language is considered dynamically typed
iff from every value in it (whose type is known to be some arbitrarily
generic type) one can deduce whether it belongs to an arbitrarily
specific subtype. (Maybe this property of a language should be called
"type safety" instead.)

But what, then, is a statically typed language? How about it being a
language where a value of some specific type cannot be treated as a
value of a more generic type -- possibly because the language does not
have a concept for the more generic type? Under these definitions, Java
would be a language both static (it does not have a common runtime type
for all values) and dynamic (for every static type, the subtype can be
found out to an arbitrary degree).

Because every (semantically closed) subset of a language is also a
language, non-dynamic languages may have dynamic sublanguages. Maybe
the "size" of the dynamic sublanguage can serve as an intuitive
definition of the dynamicity of a language.

Now, let's try your examples in light of these defn's...

>For example in Perl if you
>consider only scalars, arrays, hashes, subroutines and IO handles,
>then it's statically typed. If you distinguish references blessed for
>a particular package as distinct types of scalars, then it's not.

I don't know Perl very well, but I think it's not statically typed
anyway, because there is a supertype for all types (the wildcard). But
maybe Perl gurus will educate me.

>Is checking the exn constructor in OCaml dynamic typing? I know OCaml
>doesn't call them types, but they correspond to types in other languages.

Probably not, because exceptions are not _values_, and so the types of
exceptions are not part of the type system of the language (if my
definition above is used).

>We could declare any feature to be statically typed merely by calling

>To be constructive, I think the distinction between mostly static
>typing and mostly dynamic typing is almost clear after we decide
>what to call the type, which is somewhat arbitrary. A language is
>statically typed if it routinely and predictably rejects programs
>which can't be proved to follow its typing rules. It's dynamically
>typed if it accepts unsure programs by default, and at most tries
>to warn when types look deeply suspicious.

To be pedantic, a language does not "reject" programs, it contains them
or does not do so. But this definition was the one I had most trouble
with initially, because in Lisp I *can* reject programs at compile time
by generating compile errors programmatically. You don't have to
connect static type checking to a language; you can write it as an
add-on. So I find little benefit in defining the typing of a language
in terms of the behaviour of its implementation.

>Perl feels dynamically typed when using it, and Java feels statically
>typed.

Probably because in Perl, you don't have to write type declarations, and
in Java, you must. But Java starts to feel quite dynamic, too, if you
declare all your variables as "Object" and use instanceof a lot. Should
it be deemed a language feature that Java programmers seldom do this?

>> Well, then, Lisp without use of type predicates is statically typed.
>It's not, because compilers don't reject programs merely because they
>can't prove them to use types consistently.

Would you consider a compiler that refuses to compile (* "foo" 3/5) to
not implement Lisp?

>> Besides, macros can be used to code-walk all invocations of
>> functions to check the type of the arguments at compile-time.
>I don't believe it's practical to do that. The original CL type system
>is not expressive enough, macros are given too little tools for
>manipulation of types (I think they can't even expand type aliases),

I just meant that you can build your own type system to CL.

>even have to have sensible static types. What is the type of COERCE?

Obviously, when you implement some specific type system, you're free to
reject programs that don't play by the rules.

Panu

Panu Kalliokoski

unread,
Oct 3, 2005, 3:20:41 PM10/3/05
to
In <87oe66s...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:
>>>Indeed, except that the last part must be exn or something truly
>>>extensible. Polymorphic variants need to reflect all variants in
>>>the type, so they can't be extended in a modular way.
>> Are there actually places where you have to write the type open?
>> Of course, the types of the functions that handle these values have
>> to have a "can-have-at-most" clause, but that can be made open by
>> extending every pattern match with a | _ -> raise RuntimeTypeError
>> ...
>Adding this pattern at the end doesn't allow to extend the function
>from outside. An open set of cases (or subtypes) means two things:
>- The ability to add new cases.
>- The ability to extend old dispatching functions to cover the added
> cases.

Did I still miss some requirements? I thought the only things needed
were (1) unwrapping values, (2) application of arbitrary objects to
arguments, (3) the ability to add new types with their own operations
and application rule. Is there still the requirement of adding new
behavior to existing operations? This troubles me because it seems to
me to be equally cumbersome to do in languages with RTTI: for efficient
and extensible type dispatch, one should be able to implement some kind
of search tree for arguments of an operation. But you can't do that
without first-class types...

Anyway, if linear time is OK, one can extend the existing operations
like this (I think):

let myopref = ref
  (function `TypeA x -> dosomething | _ -> raise RuntimeTypeError)

let myop x = (!myopref) x

[... in another part ...]

let oldop = !myopref in
myopref := function `TypeB x -> dosomethingelse | x -> oldop x

>Perhaps I'm biased because only certain dynamically typed languages
>provide open dispatch, especially one which is not done by putting
>functions inside objects or classes. Others don't find it valuable,
>or even don't allow to define new types at all, and have e.g. the
>interpretation of equality and of collection operations hardwired
>for a fixed set of builtin types.

The first being CLOS, the second being Ocaml?

>> Yet, I'm still not sure about the distinction between "truly
>> undefined behavior" and "unspecified return value";
>Undefined behavior is always a bug in the program. A correct program
>will not invoke it under any circumstances.
>Unspecified return value in Scheme is safe as long as the value is
>ignored, and it's unavoidable: many essential constructs return it.

I expressed myself unclearly. What's the difference between triggering
a situation with "undefined behaviour" and using a value that is
"unspecified"? Both are violations of safety. How is the latter more
"safe" than the former?

Marcin 'Qrczak' Kowalczyk

unread,
Oct 3, 2005, 4:14:56 PM10/3/05
to
Panu Kalliokoski <ate...@sange.fi> writes:

>>Adding this pattern at the end doesn't allow to extend the function
>>from outside. An open set of cases (or subtypes) means two things:
>>- The ability to add new cases.
>>- The ability to extend old dispatching functions to cover the added
>> cases.
>
> Did I still miss some requirements? I thought the only things needed
> were (1) unwrapping values, (2) application of arbitrary objects to
> arguments, (3) the ability to add new types with their own operations
> and application rule. Is there still the requirement of adding new
> behavior to existing operations?

Sorry, I'm mixing up what I expected to be given by the host language
with what I was implementing in terms of it.

There is no single right set of features, but various sets which can
be used to implement or emulate other features, more or less easily.

Having a small fixed set of extensible operations is enough, because
one of them can be a dynamic type tag, which is then used to build
dictionaries for dispatch.

The problem with polymorphic variants is different. Even if the whole
runtime kept the type of objects polymorphic, open for more variants,
it could only be extended by instantiating the whole runtime with an
updated type, obtaining a runtime which works on a type with more
variants and includes the added operations for working with the
new type.

Every module which adds a new type must be a functor, and in one place
you string all extension functors together. And there the compiler
might complain about hash collisions of names of polymorphic variants!
They are designed such that the compiler sees a whole polymorphic
variant when it's ultimately instantiated.
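
[A sketch of the shape of that functor style; it wraps with a `Base
constructor rather than truly merging the variants, but it shows why
every extension must be threaded through one instantiation point. All
names are illustrative:

module type RUNTIME = sig
  type value
  val eval : value -> value
end

module AddBool (R : RUNTIME) = struct
  type value = [ `Base of R.value | `Bool of bool ]
  let eval : value -> value = function
    | `Base v -> `Base (R.eval v)
    | `Bool b -> `Bool (not b)
end

Somewhere, one place must apply all such functors in sequence to
obtain the final runtime.]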

Compare it to the exn version, or OO-style version, or dynamically
typed version, where a new variant can be added by running a
standalone module, which perhaps imperatively registers named
functions for the embedded language. It can even be dynamically
loaded.
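
[For contrast, a sketch of that exn-based registration; Type_error,
register and dispatch are illustrative names:

exception Type_error

let handlers : (exn -> exn option) list ref = ref []
let register h = handlers := h :: !handlers

let dispatch v =
  let rec go = function
    | [] -> raise Type_error
    | h :: hs -> (match h v with Some r -> r | None -> go hs)
  in go !handlers

A standalone module can then add a variant and its behaviour:

exception KBool of bool
let () = register (function KBool b -> Some (KBool (not b)) | _ -> None)

No central list of variants needs to change.]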

> Anyway, if linear time is OK, one can extend the existing operations
> like this (I think):
>
> let myopref = ref
>   (function `TypeA x -> dosomething | _ -> raise RuntimeTypeError)
>
> let myop x = (!myopref) x
>
> [... in another part ...]
>
> let oldop = !myopref in
> myopref := function `TypeB x -> dosomethingelse | x -> oldop x

OCaml will not allow doing that across modules. The type of the
polymorphic variant inside the reference must be fixed when the
module is closed:

"The type of this expression, (_[> `TypeA of '_a ] -> int) ref,
contains type variables that cannot be generalized"

> I expressed myself unclearly. What's the difference between
> triggering a situation with "undefined behaviour" and using a value
> that is "unspecified"? Both are violations of safety. How is the
> latter more "safe" than the former?

Ok, strictly speaking it's not, i.e. both require a more sophisticated
analysis to locate than finding some names used. I said I didn't like
the Scheme decision of leaving values of side-effect-only procedures
unspecified. It was you who brought the example of Scheme. I said that
the OO-style way of RTTI is safe.

Panu Kalliokoski

unread,
Oct 3, 2005, 4:47:58 PM10/3/05
to
In <873bni9...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:

>> let oldop = !myopref in
>> myopref := function `TypeB x -> dosomethingelse | x -> oldop x
>OCaml will not allow doing that across modules. The type of the
>polymorphic variant inside the reference must be fixed when the
>module is closed:

Interesting. I didn't know this. Not that the static type system is
doing any good in this situation anymore anyway... I'm for adding a
"dynamic" type into Ocaml, and I've been surprised by the resistance to
do so; instead, all kinds of C-style "I know what I'm doing" loopholes
have been added.

But this particular case (extensible & efficient dispatch of operations)
really seems to me to require even more. I don't think first-class
types can be fitted in a static type system, though... or maybe as
inspection-only (no equivalent of COERCE)?

>Ok, strictly speaking it's not, i.e. both require a more sophisticated
>analysis to locate than finding some names used. I said I didn't like
>the Scheme decision of leaving values of side-effect-only procedures
>unspecified. It was you who brought the example of Scheme. I said that
>the OO-style way of RTTI is safe.

Yes, I wasn't just sure what kind of "unsafety" you were referring to
with Obj.magic. But it was as I suspected.

Panu Kalliokoski

unread,
Oct 4, 2005, 5:21:17 AM10/4/05
to
In <87k6gus...@qrnik.zagroda> Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> writes:
>>>It's not that black & white. First, because there is no
>>>language-independent definition of a type.
>> Then let's try to devise one.
>It will involve an arbitrary decision per language. It's impossible
>to invent a universal unambiguous formulation which would agree with
>the intuition, especially since sometimes the intuition doesn't tell
>anything.

That's why hunting for a definition is good even though no precise,
good-for-every-situation definition was found. Trying to come up with a
definition forces you to think what your intuition really is about. For
example, your definition is based on what a compiler for the language
will typically do or is required to do by the language definition. But
many statically typed languages do not actually require their compilers
to reject any program with type errors at compile time...

>> "Value" is not a language-independent notion, but I think we can
>> make do with the usual definition of first-class citizenship.
>"Value" is much more clear across languages than "type". I think we

Much clearer than what in different languages _is called_ a type, but not
necessarily clearer than what _is_ a type in different languages.

>> I think that for the purposes of this discussion, it is not sufficient
>> to treat as types only those properties of values that a particular
>> language happens to call types.
>Sure. Usually it agrees though.

"it" = the way different languages call something "types"?

>> We should probably deem as a type _any property that will make the
>> values (that have it) suitable for handling in some specific way_.
>This means any property at all, it's too broad.

(I don't think it includes nonconstructive properties.) Too broad for
what? Do you see some specific problem coming? Or does it not match
your intuition, or what most languages (but not Common Lisp) call
"types"? For my purposes, this definition might be very good. Because
the word "type" has been defined similarly many times ("set of values",
"those values that can carry out some specific operations") the
definition might be very useful for other purposes, too.

If it pleases you, here is another attempt at a definition of what a
type is specifically in programming languages: any property that must be
known to perform any operation other than referring to the value or
testing the value for identity. If one wants GC-like properties to be
deemed "typeless", one can extend this with copying the value and
testing the value for structural equality.

>> My next suggestion is that a language is considered dynamically typed
>> iff from every value in it (whose type is known to be some arbitrarily
>> generic type) one can deduce whether it belongs to an arbitrarily
>> specific subtype.
>I would prefer to call them "supporting dynamic typing" or something,
>because it doesn't rule out static type checking too, while "dynamically
>typed" generally implies lack of systematic static typechecking.

Probably in your world, yes. I never thought of "dynamically typed" and
"statically typed" as mutually exclusive. Or course, good support for
the former usually alleviates the need for the latter, and vice versa.

>> But what, then, is a statically typed language? How about it being a
>> language where a value of a some specific type cannot be treated as a
>> value of a more generic type -- possibly because the language does not
>> have a concept for the more generic type?
>Then Java would be statically typed (because primitive types like int
>don't have subtypes) but C# would not (because object is a supertype
>of all types, including int).

Does C# have checked downcasts? If it has and if everything has a
common supertype in C#, it really is my intuition that C# is not
statically typed.

>And by the way, do you mean that *no* values can be treated as of a
>supertype?

I was thinking about casting values to the most generic type. I can see
that that's not what I wrote... :( This definition is not very good
anyway. Maybe I should go my original way and define a language to be
statically typed when, for every expression, it requires the programmer
to constrain the type of every value of that expression to be known
-- with respect to other expressions, to account for type variables.

>> I don't know Perl very well, but I think it's not statically typed
>> anyway, because there is a supertype for all types (the wildcard).
>I think it can be thought of as a Lisp symbol: it refers to the
>internal object which has slots for a few global things sharing the
>same name. It's not a supertype: an array can be referred to from
>a wildcard only if it's named with a global name.

[...]


>Some would say that Perl distinguishes $ @ % & prefixes in variables,
>it's equivalent to type declarations.
>It doesn't statically distinguish references to different types
>from each other though (nor from strings and numbers). I mean both
>distinguishing references to scalars, to arrays etc., and also
>distinguishing references blessed to different packages. That's why
>it should be considered dynamically typed.

Interesting. References seem to be able to point to anything, and you
can find out about the type of the thing pointed to by the reference...
so Perl at least has dynamic typing, but it also seems to have
non-reference types that are statically typed. Quite similarly to Java,
actually.

Does Perl provide a way to distinguish between different types of
scalars? For instance, can I tell the string "ARRAY(0xdeadbeef)" from a
reference pointing to that location? If not, can I forge a reference by
making a string with those contents?

>The set of builtin types is so small that you generally encode what
>would be different types in other languages by different values, e.g.
>having dictionaries with a distinguished key which serves for checking
>the nature of the entity it represents. It might become similar to
>emulating a dynamic type system within a fixed set of static types.
>It's all pretty foggy.

That's why people talk about the "power" of a type system. But I
wouldn't say that a language with a more powerful static type system is
more statically typed... in either case, you can use some dynamic subset
of values (e.g. strings) (or, with your terminology, subset of values
with support for dynamic typing) to represent every value in your
application.

>>>Is checking the exn constructor in OCaml dynamic typing? I know OCaml
>>>doesn't call them types, but they correspond to types in other languages.
>> Probably not, because exceptions are not _values_, and so the types
>> of exceptions are not part of the type system of the language (if my
>> definition above is used).
>I don't understand. OCaml exceptions are definitely values.

I was wrong. Frankly, I have never used OCaml exceptions for other
purposes than throwing them... I remember noticing that force stores
exceptions, but I didn't know one can use them in ordinary pattern
matches.

Yes, the exn type is really dynamic, similar to Java's Object.
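
[A small illustration: exn values can be matched like any other
values, outside of any try ... with:

exception Foo of int
exception Bar

let describe (e : exn) = match e with
  | Foo n -> "Foo " ^ string_of_int n
  | Bar -> "Bar"
  | _ -> "some other exn"

let s = describe (Foo 42)   (* "Foo 42"; nothing is ever raised *)

And new constructors can be added from any module, much as new
subclasses of Object can in Java.]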

>> them or does not do so. But this definition was the one I had most
>> trouble with initially, because in Lisp I *can* reject programs at
>> compile time by generating compile errors programmatically.
>That's why I said "routinely and predictably". In Lisp the degree of
>detection varies with the sophistication of the implementation, it's
>not a part of the language definition.

But should we consider, for example, C interpreters "implementations" of
C? I think we should, even though they probably will not complain about
a type violation until runtime. If a language must, in order to be
statically typed, require its implementations to reject type-violating
programs before running them even partially, then a lot of languages
usually deemed statically typed aren't actually statically typed.

>The "type checker" can be
>fooled into accepting the program by rewriting the program in trivial
>ways.

I don't understand. If I generate compile errors programmatically, I
can be as strict as I want to be.

Anyway, thank you for pointing out interesting issues in Perl, Ocaml and
Common Lisp. I've learned more about those languages. Actually, I'm
surprised I didn't notice earlier that exception values can really be
used like totally open polymorphic variants. Perl and Common Lisp, on
the other hand, are not my "languages of choice", so it's interesting to
learn about their approaches to things.

Dirk Thierbach

unread,
Oct 4, 2005, 5:30:46 AM10/4/05
to
Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> Dirk Thierbach <dthie...@usenet.arcornews.de> writes:

>>> If it had RTTI, which needs little change in the type system but
>>> big changes in object representation,

>> And probably big changes in the GC

> Well, the implementation of the GC depends on object representation,
> but it's not intrinsically harder because of RTTI.

No, of course not. But again, that's not the point. You want to
re-use the native GC, which is written for static types, for dynamic
types. That doesn't work (because it was not written for it, though
it easily could have been), and instead of blaming it on that, you
blame it on the static typing system. I think that's a bit unfair,
and that's what I am trying to point out.

> It's not a problem with RTTI itself but with fooling the compiler that
> something has a different type than it really does, together with the
> desire to perform optimizations basing on type information. If the
> compiler honestly saw that the value can have any type, it would just
> use the more general implementation.

Of course. But the static type system is there to allow the compiler
to take advantage of its knowledge about static types. Again, you
cannot have the cake and eat it at the same time. Either you have
static types, and the compiler can optimize, or you don't, and then
you'll have to check at runtime. You can convert between those
representations, and you can complain that OCaml doesn't offer the
second representation because it was not made with that in mind,
but you cannot have both at once.

[...]


> Requires even primitive types to accept the overhead of dynamic
> typechecking of their contents.

That's the price you pay for dynamic typing -- if you would implement
it in the dynamically typed language itself, you'd have to pay the
same price. Dynamic typing comes always with a loss of efficiency
for the gain of flexibility.

> Objects which don't merely hold data but are opaque to the host
> language must be represented indirectly: thread handles, mutexes,
> sockets, weak references, wrapped objects from other languages,
> zlib compressor state, Mersenne Twister's random state, iconv
> conversion descriptors, database connection handles etc.

Yes. But most of these would have to be handled specially, anyway.
Sockets for example require an explicit close, and cannot just
be GC'd (unless there are finalizers).

>> Yes. But that's not the point: The point is that efficiency cannot
>> be so bad, if this technique is used everywhere.

> It's not "everywhere", only when communicating with the world external
> to the process where it's unavoidable.

And communicating with a native library in this respect is very much
like communicating with the external world. You're doing something
that is not native to the language you are interpreting: a later
add-on which was not planned for in advance.

> People expect this to be less efficient than local computation,
> libraries use buffering and caching to reduce the amount of physical
> communication.

And in the same way, people can live with a bit less efficiency in
additional libraries. And the little time lost during the lookup
is not going to be the bottleneck; if you want to increase efficiency,
other places are much more suitable for optimization.

So yes, you pay, but you don't pay enough to actually have to get
worked up about it :-) And it still works fine with static typing.

- Dirk


Donn Cave

unread,
Oct 4, 2005, 12:55:41 PM10/4/05
to
In article <dhthid$t7g$1...@oravannahka.helsinki.fi>,
Panu Kalliokoski <ate...@sange.fi> wrote:
...

> If it pleases you, here is another attempt at a definition of what a
> type is specifically in programming languages: any property that must be
> known to perform any operation other than referring to the value or
> testing the value for identity. If one wants GC-like properties to be
> deemed "typeless", one can extend this with copying the value and
> testing the value for structural equality.

That makes some sense to me. I wonder though, how deeply do
you want to distinguish "property"? Suppose I have a functional
context like this,

let hostname = service.dest()

where hostname is a 7-bit ASCII string.

For conventional static analysis, this means service
must be of a type that has an attribute "dest", function
with no parameters returning a String. If you're going
to check type at run time, same story.

But in a language like Python where types aren't really
checked in this sense, we aren't going to care if this
"dest" function is there or what it does, until we use it.
At that point of application, it seems to me that we have
a somewhat unbounded expectation. The attribute must exist,
it must return a String, but moreover it had better actually
be the name of the service host, the DNS name if possible,
and depending on the application we may expect the canonical
host name or not, with various possible definitions of
"canonical" (full reverse lookup, follow CNAME aliases but
no reverse lookup, whatever getaddrinfo does.) That could
have some bearing on eventual success of the program, for
example if it needs to acquire authentication credentials for
that service by its canonical name.

At some point it's probably absurd to think of this in
terms of type mismatch, but I don't see any clear reason
for the Python programmer to draw the line at the same
place as the type checker would have to. There seems to
be an "intentional" type here, that the more superficial
analytical type only partly captures.

Donn Cave, do...@u.washington.edu

Marcin 'Qrczak' Kowalczyk

unread,
Oct 4, 2005, 3:19:54 PM10/4/05
to
Panu Kalliokoski <ate...@sange.fi> writes:

> But many statically typed languages do not actually require their
> compilers to reject any program with type errors at compile time...

Which ones?

>>Sure. Usually it agrees though.
>
> "it" = the way different languages call something "types"?

Yes. While it's not always exactly the thing we are looking for,
often it's close.

> Do you see some specific problem coming? Or does it not match
> your intuition, or what most languages (but not Common Lisp) call
> "types"?

It doesn't seem to be helpful in defining static or dynamic typing
(all languages are roughly equivalent in what they can do with
arbitrary properties), it doesn't match my intuition, and it doesn't
match the terminology used in most languages (except Common Lisp).

We already have the term "property". Let's use it for arbitrary
properties, or perhaps with some adjective if you want to restrict
them to be computable or something.

> If it pleases you, here is another attempt at a definition of what
> a type is specifically in programming languages: any property that
> must be known to perform any operation other than referring to the
> value or testing the value for identity. If one wants GC-like
> properties to be deemed "typeless", one can extend this with copying
> the value and testing the value for structural equality.

If by "perform" you mean "successfully perform", then it's too broad:
it includes all possible computable properties.

>>> But what, then, is a statically typed language? How about it being
>>> a language where a value of a some specific type cannot be treated
>>> as a value of a more generic type -- possibly because the language
>>> does not have a concept for the more generic type?
>>
>>Then Java would be statically typed (because primitive types like int
>>don't have subtypes) but C# would not (because object is a supertype
>>of all types, including int).
>
> Does C# have checked downcasts? If it has and if everything has a
> common supertype in C#, it really is my intuition that C# is not
> statically typed.

It has. It shouldn't imply that it's not statically typed: you said
that for you static and dynamic typing are not exclusive, so it could
only show that it is dynamically typed, or supports dynamic typing.

Anyway, the fact that in C# everything is a subtype of object, and in
Java primitive types are excluded, should not make a difference wrt.
supporting static and dynamic typing. It's too small a detail to matter.

> Maybe I should go my original way and define a language to be
> statically typed when, for every expression, it requires the
> programmer to constrain the type of every value of that expression
> to be the known -- with respect to other expressions, to account
> for type variables.

This should be very similar to my definition of being statically typed
when accepting only programs with consistent usage of types (if the
programmer must constrain the type, it's probably for the compiler to
reject programs with inconsistent constraints).

This applies even if the language's type consistency rules don't
really guarantee type consistency, as in the case of Eiffel, which
should be called statically typed even though its type system is
unsound.

The definition is relative to what we call a type. If we say that all
values have the same type Top, then all programs are trivially checked
and accepted, and all languages are statically typed. If we allow
arbitrary properties as types, then compilers will not check them at
compile time, and no languages are statically typed. This definition
works only as long as "types" are defined at some reasonable level of
precision.

> Does Perl provide a way to distinguish between different types of
> scalars?

Yes:

ref EXPR
ref Returns a non-empty string if EXPR is a reference, the empty
string otherwise. If EXPR is not specified, $_ will be used.
The value returned depends on the type of thing the reference
is a reference to. Builtin types include:

SCALAR
ARRAY
HASH
CODE
REF
GLOB
LVALUE

If the referenced object has been blessed into a package, then
that package name is returned instead. You can think of "ref"
as a "typeof" operator.

if (ref($r) eq "HASH") {
print "r is a reference to a hash.\n";
}
unless (ref($r)) {
print "r is not a reference at all.\n";
}
if (UNIVERSAL::isa($r, "HASH")) { # for subclassing
print "r is a reference to something that isa hash.\n";
}

> For instance, can I tell the string "ARRAY(0xdeadbeef)" from a
> reference pointing to that location? If not, can I forge a reference
> by making a string with those contents?

You can't forge a reference that way. Also, a reference causes the GC
to keep the referred object alive.

>>The set of builtin types is so small that you generally encode what
>>would be different types in other languages by different values,
>>e.g. having dictionaries with a distinguished key which serves for
>>checking the nature of the entity it represents. It might become
>>similar to emulating a dynamic type system within a fixed set of
>>static types. It's all pretty foggy.
>
> That's why people talk about the "power" of a type system. But I
> wouldn't say that a language with a more powerful static type system
> is more statically typed...

I would say that in cases when there are very few types it's harder to
tell whether the language is statically typed. Especially in the case
of Perl with quite irregular semantics. For example the first four
kinds on the list above (SCALAR..CODE) are distinguished syntactically
when they are named directly, but others aren't; they have some
subtyping relationships ("REF" and "GLOB" seem to be subtypes of
"SCALAR", or perhaps all three are subtypes of things which can be put
in scalar variables), so I don't know whether references and globs
should be considered separate types from other scalars or not.

>>That's why I said "routinely and predictably". In Lisp the degree of
>>detection varies with the sophistication of the implementation, it's
>>not a part of the language definition.
>
> But should we consider, for example, C interpreters "implementations"
> of C? I think we should, even though they probably will not complain
> about a type violation until runtime.

They should still complain before running any code, not when the
offending piece is executed. At least during a single read-eval-print
cycle. Otherwise I would claim that they don't implement C, which
clearly says what properties must be satisfied statically.

> If a language must, in order to be statically typed, require its
> implementations to reject type-violating programs before running
> them even partially, then a lot of languages usually deemed
> statically typed aren't actually statically typed.

Which ones?

>>The "type checker" can be fooled into accepting the program by
>>rewriting the program in trivial ways.
>
> I don't understand. If I generate compile errors programmatically,
> I can be as strict as I want to be.

A Lisp compiler will probably fail to detect a type error if I
e.g. insert an indirectly called identity function, or perform
other changes which don't really change which operations are actually
performed, but after which the compiler can no longer statically infer
which values are passed where.

If in the case of doubt a language assumes that the program is
correct, it's dynamically typed. If it rejects it just to be sure,
it's statically typed.
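
[For contrast with the SBCL session earlier in the thread, an OCaml
toplevel refuses the analogous program outright, even though the
offending branch could never run; a small illustration:

# let f x = if x then 1 else 1 + "a";;
This expression has type string but is here used with type int

There is no warn-and-continue option: the program is rejected.]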

Panu Kalliokoski

unread,
Oct 5, 2005, 1:24:55 AM10/5/05
to
In <donn-703FD8.0...@gnus01.u.washington.edu> Donn Cave <do...@u.washington.edu> writes:
>> If it pleases you, here is another attempt at a definition of what a
>> type is specifically in programming languages: any property that must be
>> known to perform any operation other than referring to the value or
>> testing the value for identity. If one wants GC-like properties to be
>> deemed "typeless", one can extend this with copying the value and
>> testing the value for structural equality.
>That makes some sense to me. I wonder though, how deeply do
>you want to distinguish "property"? Suppose I have a functional
>context like this,
[... explanation of how "the set of all FQ hostnames" can be a type ...]

My first definition of "type" (which I think is more useful) was
intended to include types like this. My second definition (to which you
comment) was not meant to do so. I'm not sure whether you are
suggesting that it does include types like this; anyway, a better
wording might be "any property that must be known to perform any
operation other than ..., and from which no other such property can be
deduced (i.e. it does not have a 'superproperty' with the same
characteristics)".

>place as the type checker would have to. There seems to
>be an "intentional" type here, that the more superficial
>analytical type only partly captures.

I think the definitions of static and dynamic typing can be based on a
definition of type that includes these "intentional" types, but the
definition above was purposefully devised to better catch the intuition
of what is usually meant by a "type" in the programming language
parlance.

Philippa Cowderoy

unread,
Oct 5, 2005, 9:17:32 AM10/5/05
to
On Tue, 4 Oct 2005, Marcin 'Qrczak' Kowalczyk wrote:

> If by "perform" you mean "successfully perform", then it's too broad:
> it includes all possible computable properties.
>

Nothing wrong with that - there're type systems that can represent them.

--
fli...@flippac.org

The task of the academic is not to scale great
intellectual mountains, but to flatten them.

Lauri Alanko

unread,
Oct 9, 2005, 8:51:14 AM10/9/05
to
In article <slrndk0c92....@texas.cs.uiuc.edu>,
Joe Hendrix <jhen...@uiuc.edu> wrote:
> I think the biggest problem is decidability. In a theorem proving
> environment, one can use the prover to show type correctness.

Are you sure? At least in Coq, you are manually solving the type
_inhabitation_ problem: trying to come up with a term (proof) that has
the desired type (proposition). Once you have come up with a proof,
_checking_ it is quite straightforward. Comparing type equality is
complicated but decidable (the type language is strongly normalizing).


Lauri

Lauri Alanko

unread,
Oct 9, 2005, 9:08:10 AM10/9/05
to
In article <dhiepn$ci0$1...@oravannahka.helsinki.fi>,
Panu Kalliokoski <ate...@sange.fi> wrote:
> I fail to see how OCaml's subtyping is "limited", and I've always deemed
> OCaml the language where type subsumption (of classes) is _implicit_,
> based on the subsumption relation of their type signatures.

No, you need explicit coercions occasionally:

# class type foo = object method x : int end;;
class type foo = object method x : int end
# class bar = object method x = 4 method y = 3 end;;
class bar : object method x : int method y : int end
# let baz : foo = new bar;;
This expression has type bar but is here used with type foo
Only the first object type has a method y
# let baz : foo = (new bar :> foo);;
val baz : foo = <obj>
# let quux (x:foo) = ();;
val quux : foo -> unit = <fun>
# quux (new bar);;
This expression has type bar but is here used with type foo
Only the first object type has a method y
# quux (new bar :> foo);;
- : unit = ()


Lauri

Joachim Durchholz

unread,
Oct 9, 2005, 11:38:26 AM10/9/05
to
Lauri Alanko schrieb:

I think the real issue is making the type language as expressive as
possible while retaining decidability.
In practice, decidability isn't exactly enough, we'd also want a
polynomial or even O(N log N) complexity and a reasonably simple
implementation. Plus we want something that people can perform in their
heads, else emitting an understandable error message might well become
undecidable or impossible. Plus any system of types should be
decomposable into constituent parts.

Not a simple problem, that :-)

Regards,
Jo

Lauri Alanko

unread,
Oct 9, 2005, 2:23:33 PM10/9/05
to
In article <dibdhj$ur4$1...@online.de>,
Joachim Durchholz <j...@durchholz.org> wrote:
> I think the real issue is making the type language as expressive as
> possible while retaining decidability.

Why is decidability so important? There are quite interesting type
systems that are undecidable but nevertheless usable in practice.
Usually undecidability only occurs in pathological cases and then it
is not a big deal to have a fixed cutoff limit. Yes, then there will
be some well-typed programs that the checker won't accept, but most
likely they will be due to such awful types that you shouldn't be
using in the first place. :)


Lauri

Joachim Durchholz

unread,
Oct 9, 2005, 2:50:56 PM10/9/05
to
Lauri Alanko schrieb:

The problem is that there's a lot of handwaving involved with that
argumentation. Saying "usually" and "most likely" doesn't help me if I
really happen to run into one of those fringe cases - and I'd be hard
pressed to tell which kinds of types will trigger this kind of problem,
so I can't simply program ahead.
And if I ever *do* hit such a type, there's no good exit strategy.

BTW I have been bitten by "usually"s and "most likely"s before. That
time, it was a language designer arguing that the holes in his type
system were too unlikely to ever be of practical relevance. It turned
out that as technology changed, its use changed, and what was highly
unlikely initially became commonly desirable over time. Without an exit
strategy, this means the language is doomed.

If a type system has undecidable (or simply too-long-running) borderline
cases, I'd demand that these cases be clearly delineated in the docs.
Unless that has been done, this only means that the type system
designers didn't do their homework.
About the only excuse I'd accept is a type system that accepts hints
to help the type inferencer along, together with good annotation
strategies that are known to work in all cases (besides the trivial
"annotate everything", of course *g*).

Regards,
Jo

Andreas Rossberg

unread,
Oct 10, 2005, 7:17:10 AM10/10/05
to
Joachim Durchholz wrote:
>
> I think the real issue is making the type language as expressive as
> possible while retaining decidability.
> In practice, decidability isn't exactly enough, we'd also want a
> polynomial or even O(N log N) complexity and a reasonably simple
> implementation.

You may be taking the problem too seriously. Even our beloved
Hindley/Milner type inference algorithm already has exponential
worst-case behaviour (so your requirement is prohibitive for
polymorphic type inference). As far as I know, that has never been a
problem in practice in almost three decades. Likewise, a number of
programming languages have had undecidable type systems for ages
without ever causing practical problems.
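
For illustration, a sketch of the classic pathological case in OCaml:
each binding below applies the previous function twice, squaring the
size of the principal type, so the type grows exponentially in the
length of the program.

let f0 x = (x, x)    (* 'a -> 'a * 'a                          *)
let f1 x = f0 (f0 x) (* 'a -> ('a * 'a) * ('a * 'a)            *)
let f2 x = f1 (f1 x) (* result type contains 16 copies of 'a   *)
let f3 x = f2 (f2 x) (* 256 copies; a few more such lines and
                        merely printing the type is infeasible *)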

- Andreas

--
Andreas Rossberg, ross...@ps.uni-sb.de

Let's get rid of those possible thingies! -- TB

Joachim Durchholz

unread,
Oct 10, 2005, 8:05:44 AM10/10/05
to
Andreas Rossberg schrieb:

> Likewise, a number of programming languages have
> had undecidable type systems for ages without ever causing practical
> problems.

Which?

Regards,
Jo

Andreas Rossberg

unread,
Oct 10, 2005, 8:17:46 AM10/10/05
to
Joachim Durchholz wrote:
>
>> Likewise, a number of programming languages have had undecidable type
>> systems for ages without ever causing practical problems.
>
> Which?

OCaml, for example. For Haskell, a lot of people happily toggle the
-fallow-undecidable-instances or -fallow-overlapping-instances switch of
GHC to do type-level programming. C++ templates are undecidable modulo
the (implementation-dependent) brute-force restriction on instantiation
depth (I'm less sure about the lack of practical problems here). There
surely are more.

Mark Carroll

unread,
Oct 10, 2005, 9:29:13 AM10/10/05
to
In article <didm5a$8rt5a$5...@hades.rz.uni-saarland.de>,
Andreas Rossberg <ross...@ps.uni-sb.de> wrote:
(snip)

>OCaml, for example. For Haskell, a lot of people happily toggle the
>-fallow-undecidable-instances or -fallow-overlapping-instances switch of
>GHC to do type-level programming. C++ templates are undecidable modulo
(snip)

I've been wondering, given the clever stuff that jhc does, like basing
its core on the lambda cube, whether that affects what options would
be needed for a more mature jhc rival to ghc. Would there be any
change to what is judged undecidable, or do both ways of analysing
types come up with the same answers?

-- Mark

Joe Hendrix

unread,
Oct 11, 2005, 12:37:45 AM10/11/05
to
Lauri Alanko wrote:
> Are you sure? At least in Coq, you are manually solving the type
> _inhabitation_ problem: trying to come up with a term (proof) that has
> the desired type (proposition).

No, I'm not sure -- I don't really know much about Coq. Does Coq
provide significant assistance in coming up with the term (proof)?

The main theorem prover I have experience with is ACL2. It is based on
a first-order applicative subset of Lisp, and it does not have types,
but it supports something like domain verification. One can assign a
proposition called a guard to a function, and ask the theorem prover to
verify that if the guard is satisfied, every function call in the body
of the function satisfies the guard of that function for the specified
arguments. This is an issue both for verifying correctness and
maximimizing efficiency. This is quite powerful since arbitrary
functions may be used, but is not always used since guard verification
is often nontrivial.
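
A loose sketch of the idea, with OCaml standing in for ACL2's Lisp
(this is illustrative, not ACL2 syntax): the guard of fact below is
n >= 0, and guard verification would discharge the obligation that
n >= 0 together with n <> 0 implies n - 1 >= 0, after which the
runtime check could be erased without changing behaviour.

let rec fact n =
  assert (n >= 0);                      (* the guard *)
  if n = 0 then 1 else n * fact (n - 1)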

> Comparing type equality is complicated but decidable (the type language
> is strongly normalizing).

I did not know that. I'll need to read about that decision procedure at
some point.

Best regards,
Joe

Message has been deleted

Jon Harrop

unread,
Oct 18, 2005, 8:37:24 AM10/18/05
to
Andreas Rossberg wrote:
> Joachim Durchholz wrote:
>>> Likewise, a number of programming languages have had undecidable type
>>> systems for ages without ever causing practical problems.
>>
>> Which?
>
> OCaml, for example.

What is undecidable in the OCaml type system?

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
