Dependent types and how to get rid of them

123 points by pie_flavor 10 days ago

Recent and related: Why don't you use dependent types? - https://news.ycombinator.com/item?id=45790827 (116 comments)

nixpulvis 3 days ago

I read the first post and thought someone should at the very least post the lambda cube. This isn't my area of expertise, since I've done very little with dependent types (staying firmly on the left side of the cube), but it outlines some useful categories of type systems here.

https://en.wikipedia.org/wiki/Lambda_cube

azdavis 3 days ago

I wrote a small post on that: https://azdavis.net/posts/lambda-cube/
Hope it’s helpful!
- nixpulvis 3 days ago
  
  Cool that you worked on an implementation of the CoC.
  I've been somewhat wanting to go back and revisit ATAPL and read chapter 2 on this subject: https://www.cis.upenn.edu/~bcpierce/attapl/frontmatter.pdf
- ChadNauseam 3 days ago
  
  That is a good post. I've linked to it from mine!
  - azdavis 3 days ago
    
    Thanks! But I don’t think it quite worked?
    
    ChadNauseam 3 days ago
    
    Ought to be fixed now

andyferris 3 days ago

I see we need to add special syntax to the signature for dependent type variables.

If you take Zig, it's `comptime` parameters are kind of similar. They can be used to create functions that return types or whose output type depends on the inputs, etc. It seems to fulfil the three axioms at the start, no? The erasure stuff seems just as relavant.

Can I say that `comptime` is dependent types in imperative clothing? Or is there a missing capability making it strictly weaker?

codebje 3 days ago

You could probably say that. AFAIK there isn't a single valid definition of a dependently typed language any more than there is a single valid definition of a functional language.
I usually go with "you can make a dependent pair", which is two terms where the type of the second depends on the value of the first; I think you could do that in Zig with a bit of handwaving around whether the comptime expression really is the first value or not.
What Zig's doing is also, as best I understand it all, the only real path to anything like dependent types in imperative languages: make a separate sub-language for constant expressions that has its own evaluation rules that run at compile time. See also C++ constexpr, or the constants permitted in a Rust static array size parameter (practically a dependent type itself!)
A functional language's basic building block is an expression and terms are built out of composed expressions; normalisation means reducing a term as far as possible, and equality is (handwaving!) when two terms reduce to exactly the same thing. So you can do your dependent terms in the usual source language with the usual evaluation rules, and those terms are freely usable at runtime or compile time, it's all the same code.
An imperative language's basic building block is a memory peek or poke, with programs composed of sequences of peeks and pokes scheduled by control flow operators. It's much less clear what "normalisation" looks like here, so imperative languages need that separate constant-value embedded language.
- jibal 2 days ago
  
  Zig's sub-language and evaluation rules are exactly the same as the Zig language except that `type` is a first class type and the TypeInfo of any type can be both examined and modified. In Zig, generics are implemented by computing types--most often by a function returning a struct that is specialized to a type argument to the function.
  I've been writing some D code and, while D "template" functions can take compile-time arguments that are types, they can only produce types via mixins, which are strings that are injected at the point of invocation and then parsed by the compiler. This gives you all of the power of Zig and then some, but it's not nearly as elegant (it would help a lot if D's token quote construct q{...} had lisp-like quasi-quote functionality). OTOH, I found when writing Zig it could become extremely confusing to have the comptime and runtime languages intermixed, which is often done for loop unrolling and "conditional compilation" -- whether it's a comptime or runtime operation is in many cases a function of whether the variables involved are comptime or runtime types. In D it's generally clearer because comptime operations are distinguished with such things as static if for flow control, enum for declaring comptime variables, foo!(comptime args)(runtime args) for function calls, etc.
  - codebje 2 days ago
    
    Perhaps you only need the clarity because the rules for what's acceptable at comptime are not the same as the rules for what's acceptable at runtime.
ChadNauseam 3 days ago
I'm not familiar with Zig, but I'm pretty sure comptime is not quite as powerful.
In the post, I didn't give any examples of situations where the input to a dependent function was not known at compile time. However, that was just due to the lack of time when writing the post.
I gave this example:
```
    useValue : Bool -> String
    useValue True = "The number is: " ++ natToString (getValue True)
    useValue False = "The message is: " ++ getValue False
```
The difference is that nothing in dependent types requires that the depended-on value `b` is known at compile time. You could pass any Bool to `getValue`, including one that was read as input from the user. It would just be the case that, before using the value returned by `getValue b`, you'd have to check (at runtime) the value of `b`. Only then would you know `pickType b`, which would tell you the type of `getValue b`. My apologies that the post was unclear about this. (I've just updated it.)
- spockz 3 days ago
  
  If you don’t know at compile time what the type of the value is that is returned, how do you know whether the remainder of the expression is still correct?
  The choice is either to have a total function that will be able to deal with every type and check at compile time whether they exist or to defer to the runtime and then throw an exception if the program doesn’t know what to do with the value.
  - ChadNauseam 3 days ago
    
    The value returned by `getValue b` has type `pickType b`. It cannot be used in a way that depends on its type until you check the value of `b`. If `b = True`, then you know that `pickType b = pickType True = Nat` and that therefore `getValue b` has type `Nat`. There is no option to check the type of a value at runtime. You simply cannot use it in such a way as to create a runtime type error. (This is what it means for the types to be erased. The runtime doesn't know what type it's supposed to be!)
  - rtpg 2 days ago
    
    getValue : (b : Bool) -> pickType b getValue True = 42 getValue False = "hello"
    So given something like this, imagine something like the following:
    someBool = flipACoinToGetABool thingy = getValue someBool
    You're already in trouble here right? what's the type of thingy?
    So maybe you write:
    thingy: (pickType someBool) = getValue someBool
    which is fine and great!
    then you write:
    print (thingy + 3) -- thingy can be an int right?
    But at this point thingy is of type "pickType someBool" while + is expecting Int. So it's up to you to restructure your code to prove to the type checker that someBool is in fact True.
    For example:
    print ((if someBool then thingy else 5) + 3)
    ^ and here your code proves the fact! So in practice when you want to explore the results that are hidden behind the dependent types, you will need to have written code that unwraps it like this. But the whole point is the underpinning type theory means that you will be fine while doing it! You've proven that it'll be fine, so if you add thingy to 3 you'll have an integer.
    I think of this stuff a bit like encryption password. You have this bundle of data, but to get in, you need to provide proof that what you're doing won't blow up. You'll need to write your code in a certain way to do it. But if you provide the right proof, then we're golden.
  - skybrian 2 days ago
    
    Consider how Rust monomorphizes generic functions, making a copy for each possibility. I imagine a compiler could generate a different continuation for each possible return type, along with a runtime check that branches to the appropriate continuation.
    It seems like that might cause code bloat, though?
- ioanaci 2 days ago
  I commented this before but deleted it because I hadn't tested my code properly. Anyway, here is a version that works.
  You're right Zig isn't quite as powerful, but you don't need type erasure if you have enough monomorphization. It's a bit annoying to use since you have to capture the comptime version of b inside the inline switch prongs, and get_value does need to take a comptime bool instead of a runtime one. It's functionally the same, except that instead of using the bool to prove the type of the type-erased value is correct and delaying the actual if (b) check to runtime, we're moving it to compile time and instead proving that b has the right value for each case, then generating specialized code.
  This does NOT mean that it isn't dependent on user input, the call do_thing(get_user_input()) would absolutely work. do_thing has no compile-time parameters at all.
  I don't have the natToString thing because it's a bit more annoying in Zig and would obscure the point.
  const std = @import("std"); fn get_value(comptime b: bool) if (b) u32 else []const u8 { if (b) return 12 else return "string"; } fn do_thing(b: bool) void { switch (b) { inline true => |known_b| std.log.info("{d}", .{get_value(known_b)}), inline false => |known_b| std.log.info("{s}", .{get_value(known_b)}), } } pub fn main() void { do_thing(true); do_thing(false); }
  You could define a function
  fn pick_type(comptime b: bool) type { if (b) return u32 else return []const u8; }
  and change the signature
  fn get_value(comptime b: bool) pick_type(b)
  if you wish.
  Perhaps more convincingly, you can use inline for with a similar effect. It's obviously horrible code nobody would ever write but I think it might illustrate the point.
  fn do_thing(b: bool) void { inline for (0..2) |i| { // go through all bool variants const known_b = (i != 0); if (known_b == b) { // Essentially what we have now is a known_b which is proven // at compile-time to be equal to whatever runtime b we have. // So we're free to do any kind of dependent type things we'd like // with this value. const value = get_value(known_b); // Here we know the type of value but it still depends on the runtime b. if (known_b) std.log.info("{d}", .{value}) // treated as an int else std.log.info("{s}", .{value}); // treated as a string // We can also just decide based on the type of value at this point. if (@TypeOf(value) == u32) ... else ...; // Or, a switch switch (@TypeOf(value)) { u32 => ..., []const u8 => ..., } } } }
pron 2 days ago

Not really. With dependent types, you can have a function that returns a type that depends on a runtime value; with Zig's comptime, it can only depend on a compile-time value.
Note that a language with dependent types doesn't actually "generate" types at runtime (as Zig does at compile-time). It's really about managing proofs that certain runtime objects are instances of certain types. E.g. you could have a "prime number" type, and functions that compute integers would need to include a proof that their result is a prime number if they want to return an instance of the prime number type.
Using dependent types well can be a lot of work in practice. You basically need to write correctness proofs of arbitrary properties.
- CoastalCoder 2 days ago
  
  Sorry if this is obvious, but do these languages let programmers explicitly assert that a certain, relevant property holds at runtime?
  Or do these compilers insist on working these proofs out on there own without the benefit of programmer-supplied assertions?
  - pron 2 days ago
    
    > do these languages let programmers explicitly assert that a certain, relevant property holds at runtime?
    Yes, but the programmer has to do the really, really, really hard work of proving that's the case. Otherwise, the compiler says: you haven't proved to me that it's true, so I won't let you make this assertion. Put another way: the compiler checks your proof, but it doesn't write it for you.
    The only programs for which all interesting assertions have been proven in this, or a similar, way have not only been small (up to ~10KLOC), but the size of such programs relative to that of the average program has been falling for several decades.
  - naasking 2 days ago
    
    > Sorry if this is obvious, but do these languages let programmers explicitly assert that a certain, relevant property holds at runtime?
    You can assert something but then you also have to prove it. An assertion is like a type declaration: you're asserting values of this type can exist. The proof of that assertion is constructing a value of that type to show that it can be instantiated.
    This is what the Curry-Howard correspondence means. The types of most languages are just very weak propositions/assertions, but dependent types let you declare arbitrarily complex assertions. Constructing values of such arbitrarily complex assertions can get very hairy.

aatd86 3 days ago

I am slowly trying to understand dependent types but the explanation is a bit confusing to me as, I understand the mathematical terminology of a function that may return a type, but... Since function types take a value and return a value, they are by definition in another universe from say morphisms that would take a type and return a type.

The same way, I see a value as a ur-element and types as sets of values. So even if there is syntactic sugar around the value <-> type equivalence, I'd naively think that we could instead define some type morphism and that might be more accurate. The value parameter would merely be declared through a type parameter constrained to be a singleton. The same way a ur-element is not a set but a member of set.

Then the question is representation but that could be left as an optimization. Perhaps that it is already what is done.

Example:

type Nine int = {9} And then the rest is generic functions, parameterizable by 9, or actually, Nine.

So nothing too different from a refinement of int.

Basically, 'value' would be a special constraint on a type parameter in normal parametric polymorphism implementations. There would probably be derived constraint information such as size etc...

But I guess, the same issue of "which refinement types can be defined, while keeping everything decidable" remains as an issue.

Also how to handle runtime values? That will require type assertions, just like union types? Or is it only a compile time concept and there is no runtime instantiations. Only some kind of const generics?

A typeof function could be an example of dependent type though? Even at runtime?

Just wondering...

nathanrf 3 days ago

In the style of the linked post, you'd probably define a generic type (well, one of two generic types):
type ExactlyStatic : (0 t: Type) -> (0 v: t) -> Type
type ExactlyRuntime : (0 t: Type) -> (v: t) -> Type
Then you could have the type (ExactlyStatic Int 9) or the type (ExactlyRuntime Int 9).
The difference is that ExactlyStatic Int 9 doesn't expose the value 9 to the runtime, so it would be fully erased, while (ExactlyRuntime Int 9) does.
This means that the runtime representation of the first would be (), and the second would be Int.
> Also how to handle runtime values? That will require type assertions, just like union types?
The compiler doesn't insert any kind of runtime checks that you aren't writing in your code. The difference is that now when you write e.g. `if condition(x) then ABC else DEF` inside of the two expressions, you can obtain a proof that condition(x) is true/false, and propagate this.
Value representations will typically be algebraic-data-type flavored (so, often a tagged union) but you can use erasure to get more efficient representations if needed.
SabrinaJewson 3 days ago

In type theory, all singleton types are isomorphic and have no useful distinguishing characteristics (indeed, this is true of all types of the same cardinality – and even then, comparing cardinalities is always undecidable and thus irrelevant at runtime). So your Nine type doesn’t really make sense, because you may as well just write Unit. In general, there is no amount of introspection into the “internal structure” of a type offered; even though parametricity does not hold in general (unless you postulate anticlassical axioms), all your functions that can run at runtime are required to be parametric.
- Maxatar 3 days ago
  
  Being isomorphic is not the same as being identical, or substitutable for one another. Type theory generally distinguishes between isomorphism and definitional equality and only the latter allows literal substitution. So a Nine type with a single constructor is indeed isomorphic to Unit but it's not the same type, it carries different syntactic and semantic meaning and the type system preserves that.
  Some other false claims are that type theory does not distinguish types of the same cardinality. Type theory is usually intensional not extensional so two types with the same number of inhabitants can have wildly different structures and this structure can be used in pattern matching and type inference. Cardinality is a set-theoretic notion but most type theories are constructive and syntactic, not purely set-theoretic.
  Also parametricity is a property of polymorphic functions, not of all functions in general. It's true that polymorphic code can't depend on the specific structure of its type argument but most type theories don't enforce full parametricity at runtime. Many practical type systems (like Haskell with type classes) break it with ad-hoc polymorphism or runtime types.
  - SabrinaJewson 2 days ago
    
    This comment contains a lot of false information. I’m first going to point out that there is a model of Lean’s type theory called the cardinality model, in which all types of equal cardinality are modelled as the same set. This is why I say the types have no useful distinguishing characteristics: it is consistent to add the axiom `Nine = Unit` to the type theory. For the same reason, it is consistent to add `ℕ = ℤ` as an axiom.
    > So a Nine type with a single constructor is indeed isomorphic to Unit but it's not the same type, it carries different syntactic and semantic meaning and the type system preserves that.
    It carries different syntax but the semantics are the exact same.
    > Type theory is usually intensional not extensional so two types with the same number of inhabitants can have wildly different structures
    It is true that type theory is usually intensional. It is also true that two types equal in cardinality can be constructed in multiple different ways, but this has nothing to do with intensionality verses extensionality – I wouldn’t even know how to explain why because it is just a category error – and furthermore just because they are constructed differently does not mean the types are actually different (because of the cardinality model).
    > Cardinality is a set-theoretic notion but most type theories are constructive and syntactic, not purely set-theoretic.
    I don’t know what you mean by this. Set theory can be constructive just as well as type theory can, and cardinality is a fully constructive notion. Set theory doesn’t have syntax per se but that’s just because the syntax is part of logic, which set theory is built on.
    > most type theories don't enforce full parametricity at runtime
    What is “most”? As far as I know Lean does, Coq does, and Agda does. So what else is there? Haskell isn’t a dependent type theory, so it’s irrelevant here.
    ---
    Geniune question: Where are you sourcing your information from about type theory? Is it coming from an LLM or similar? Because I have not seen this much confusion and word salad condensed into a single comment before.
    
    aatd86 a day ago
    
    I will let Maxatar responds if he wants to but I will note that his response makes much more sense to me than yours as someone who uses traditional programming language and used to do a little math a couple decades ago.
    With yours, it seems that we could even equate string to int.
    How can a subtype of the integer type, defined extensionally, be equal to Unit? That really doesn't make any sense to me.
    > it is consistent to add `ℕ = ℤ` as an axiom Do you have a link to this? I am unaware of this. Not saying you're wrong but I would like to explore this. Seems very strange to me.
    As he explained, an isomorphism does not mean equality for all I know. cf. Cantor. How would anything typecheck otherwise? In denotational semantics, this is not how it works. You could look into semantic subtyping with set theoretic types for instance.
    type Nine int = {9} defines a subtype of int called Nine that refers to the value 9. All variables declared of that type are initialized as containing int(9). They are equal to Nine. If you erased everything and replaced by Unit {} it would not work. This is a whole other type/value.
    How would one be able to implement runtime reflection then?
    I do understand that his response to you was a bit abrupt. Not sure he was factually wrong about the technical side though. Your knee-jerk response makes sense even if it is too bad it risks making the conversation less interesting.
    Usually types are defined intensionally, e.g. by name. Not by listing a set of members (extensional encoding) in their set-theoretic semantic. So maybe you have not encountered such treatment in the languages you are used to?
    edit: the only way I can understand your answer is if you are only considering the Universe of types as a standalone from the Universe of values. In that universe, we only deal with types and types structured as composites in what you are familiar of perhaps? Maybe then it is just that this Universe as formalized is insufficiently constrained/ underspecified/ over-abstracted?
    It shouldn't be too difficult to define specific extensional types on top, of which singleton types would not have their extensional definitions erased.
  - tome a day ago
    
    > Many practical type systems (like Haskell with type classes) break it with ad-hoc polymorphism or runtime types.
    Haskell does not break parametricity. Any presence of ad-hoc polymorphism (via type classes) or runtime types (via something like Typeable, itself a type class) is reflected in the type signature and thus completely preserves parametricity.
    
    aatd86 a day ago
    
    I think what they meant is that it is not purely parametric polymorphism. Not that parametricity is broken.
    
    tome a day ago
    
    Hmm, maybe
    > most type theories don't enforce full parametricity at runtime
    means "sometimes types can appear at run time"? If so it's true, but it's not what I understand by "parametricity".
    
    aatd86 a day ago
    
    Not sure, either : Parametric polymorphism is compile time. Adding runtime behaviors is a sort of escape hatch.
    Which is fair although necessary. It's not so much a breakage as it is a necessity, since non parametric code may rely on this too.
    Or maybe it is about constrained parametricity. In any case this doesn't seem a big issue.
- aatd86 3 days ago
  
  How does it become Unit if it is an integer of value 9? Why would cardinalities be undecidable if they are encoded discretely in the type?
  For instance, type Nine int = {9} would not be represented as 9. It would probably be a fat pointer. It is not just an int, it would not even have the same operations (9 + 9 is 18 which is an int but is not of type Nine, but that's fine, a fat pointer does not need to share the same set of operations as the value it wraps).
  It could be seen as a refinement of int? I am not sure that it can truly be isomorphic? My suspicion was that it would only be somewhat isomorphic at compile time, for type checking, and if there is a mechanism for auto unwrapping of the value?
  - codebje 2 days ago
    
    There's only one possible value of type Nine; there's only one possible value of type Unit. They're isomorphic: there's a pair of functions to convert from Nine to Unit and from Unit to Nine whose compositions are identities. Both functions are just constants that discard their argument un-inspected. "nineToUnit _ = unit" and "unitToNine _ = {9}".
    You've made up your language and syntax for "type Nine int = {9}" so the rules of how it works are up to you. You're sort of talking about it like it's a refinement type, which is a type plus a predicate over values of the type. Refinement types aren't quite dependent types: they're sort of like a dependent pair where the second term is of kind Proposition, not Type; your type in Liquid Haskell would look something like 'type Nine = Int<\n -> n == 9>'.
    Your type Nine carries no information, so the most reasonable runtime representation is no representation at all. Any use of the Nine boils down to pattern matching on it, and a pattern match on a Nine only has one possible branch, so you can ignore the Nine term altogether.
    
    aatd86 2 days ago
    
    Why doesn't it seem exactly true?
    It is an integer. It is in the definition. And any value should be equal to nine. By construction Nine could have been given the same representation as int, at first. Except this is not enough to express the refinement/proposition.
    One could represent it as a fat pointer with a space allocated to the set of propositions/predicates to check the value of.
    That allows to check for equality for instance.
    That information would not be discarded.
    This is basically a subtype of int.
    As such, it is both a dependent type and a refinement type. While it is true that not all refinement types are dependent types, because of cardinality.
    I think Maxatar response in the same thread puts it in words that are possibly closer to the art.
    
    codebje 2 days ago
    
    When does the predicate get tested?
    Also, it's not a dependent type. The type does not depend on the value. The value depends on the type.
    
    aatd86 a day ago
    
    The predicate gets tested every time we do type checking? It is part of the type identity. And it is a dependent type. Just like an array type is a dependent type, the actual array type depending on the length value argument.
    edit: I think I am starting to understand. In the implementations that are currently existing, Singleton types may be abstracted. My point is exactly to unabstract them so that the value is part of their identity.
    And only then we can deal with only types i.e. everything from the same Universe.
    
    codebje 15 hours ago
    
    > The predicate gets tested every time we do type checking? It is part of the type identity.
    When does type checking happen?
    I think it happens at compile time, which means the predicate is not used for anything at all at run time.
    > edit: I think I am starting to understand. In the implementations that are currently existing, Singleton types may be abstracted. My point is exactly to unabstract them so that the value is part of their identity.
    I am not sure what you mean by "to unabstract them so that the value is part of their identity", sorry. Could you please explain it for me?
    > And only then we can deal with only types i.e. everything from the same Universe.
    If you mean avoiding the hierarchy of universes that many dependently typed languages have, the reason they have those is that treating all types as being in the same universe leads to a paradox ("the type of all types" contains itself and that gets weird - not impossible, just weird).
    
    aatd86 5 hours ago
    
    > the predicate is not used for anything at all at run time.
    It is used for its value when declaring a new variable of a given type at runtime too. It has to be treated as a special predicate. The allocator needs to be aware of this. Runtume introspection needs to be aware of this. Parametric type instantiation also needs to know of this since it is used to create dependent types.
    The point is that in the universe of types that seems to be built in dependent types, singleton Types are just Types decorrelated from their set of values. So they become indistinguishable from each other, besides their name. Or so it seems that the explanation is. It is abstracted from their value. The proposal was to keep the set definition attached. What I call unabstract them.
    The point is exactly to avoid mixing up universes which can lead to paradoxes. Instead of dealing with types as some sorts of values with functions over types, mixed up with functions over values which are also types and then functions of types and values to make some sort of dependent types, we keep the algebra of types about types. We just bridge values into singleton types. Also, it should allow an implementation that relies mostly on normal constrained parametricity and those singleton types. The point is that mixing values and types (as values) would exactly lead to this type of all types issue.
    But again, I am not familiar enough with the dependent type implementations to know exactly what treatment they have of the issue.
pinkmuffinere 3 days ago

Hi aatd86! We had a different thread about existence a couple days ago, and I'm just curious -- is English your second language? What's your first language? I'm guessing French, or something from Central Europe.
Thanks for humoring me!
anon291 3 days ago

Functions that return types are indeed at a higher level than those that don't.
Values can be seen as Singleton types. The key difference is the universe they live in. In the universe of types, the level one level below is perceived as a value. Similarly, in the universe of kinds, types appear to be values.
> Basically, 'value' would be a special constraint on a type parameter in normal parametric polymorphism implementations
Yes this is a level constraint.
> But I guess, the same issue of "which refinement types can be defined, while keeping everything decidable" remains as an issue.
If you're dealing with fully computable types. Nothing is decidable.
> Also how to handle runtime values? That will require type assertions, just like union types? Or is it only a compile time concept and there is no runtime instantiations. Only some kind of const generics?
A compiler with dependent types is essentially producing programs that are itself embedded with its input. There cannot be a distinction between runtime and compel time. Because in general type checking requires you to be able to run a program. The compilation essentially becomes deciding which parts you want to evaluate and which parts you want to defer until later.
> A typeof function could be an example of dependent type though? Even at runtime?
Typeof is just const.
Typeof: (T : type) -> (x:T) -> Type
Final note: I've written some compilers for toy dependently typed languages. By far dependent typing makes both the language and the compiler easier not harder. This is because Haskell and c++ and other languages with type systems and metaprogramming or generics actually have several languages: the value language that we are familiar with, but also the type language.
In Haskell, this is the class/instance language which is another logic programming language atop the lambda calculus language. This means to write a Haskell compiler you have to write a compiler for Haskell and the logic programming language (which is turing complete btw) that is the type language
Similarly in c++, you have to implement c++ AND the template system which is a functional programming language with incredibly confusing semantics.
On the other hand for a dependently typed language, you just write one language... The types are talked about in the same way as values. The only difficulty is type inference. However, an explicit dependently typed language is actually the easiest typed compiler to write because it's literally just checking for term equality which is very easy! On the other hand with a dependent language, type inference is never going to be perfect so you're a lot freer to make your own heuristic decisions and forego HM-like perfection.

dcre 3 days ago

This made me realize TypeScript’s conditional types are closer to dependent types than I thought.

danvk 3 days ago

Yes. You can give pickType a type in TS. Something like pickType<B extends boolean>(b: B): B extends true ? string : number.
I’d love to read a post explaining how TS conditional types are or are not a form of dependent types. Or, I’d like to understand dependent types well enough to write that post.
- morcus a day ago
  
  I just tried this, and I couldn't quite get it to work.
``` type PickType<B extends Boolean> = B extends true ? number ? string;
function foo<T extends Boolean>(t: T): PickType<T> { return t ? 42 : "42; } ```
The compiler doesn't seem to get that the type of T is narrowed in the ternary branches, and thus Pick<T> is determined.
The error I get is that a number isn't assignable to a Pick<T>.
- dcre 2 days ago
  
  I asked GPT-5 about it before writing the above comment and got a pretty interesting answer about how, for example, you can’t write a function that takes a number as an argument and returns an array of that length (length enforced by the type system). You can partly get there with tuple types but you have to enumerate every length explicitly. And it only works if you pass a number literal to the function. If you pass a variable of type number, it can’t infer the particular number it is.

munchler 3 days ago

> Instead of writing the type directly, we're calling pickType and using the type returned. The typechecker evaluates pickType True at compile time, sees it equals Nat, and then it knows you're saying myNat has type Nat.

But how does the compiler evaluate `pickType True` at compile time? Does the typechecker contain an interpreter for (some subset of) the language? It seems potentially circular for a compiler to invoke functions defined in the code that it's compiling.

Maxatar 3 days ago

Yes, the compiler contains an interpreter for the entire language, not just a subset. It is circular as you rightly point out, but there's no contradiction here.
Typically the interpreter limits itself to only evaluating pure functions or functions that do not perform any kind of IO, but in principle this restriction is not necessary. For example Jai's compiler allows completely arbitrary execution at compile time including even downloading arbitrary data from the Internet and using that data to make decisions about how to compile code.
- munchler 3 days ago
  
  Thank you for the explanation. This reminds me somewhat of an F# type provider, which can execute arbitrary code at compile time.
  https://learn.microsoft.com/en-us/dotnet/fsharp/tutorials/ty...
jibal 2 days ago

The compilers for Zig, D, Nim, and perhaps other languages contain an interpreter for the whole language that can run at compile time. It's not circular--compiling code and executing code are two quite different things (the interpreter does not contain yet another parser--it works on the AST). You simply can do calculations the results of which can be used in compile-time only contexts such as initializers. The trick in Zig is that the compile-time interpreter supports types as first-class objects, so you can compute types and implement generics this way. In Nim, the interpreter supports computations over the AST. In D, types can be passed as generic arguments, and computed strings can be "mixed in" to the source code and compiled at that point (on-the-fly code generation).

nialv7 3 days ago

Saying dependent types can be erased is like saying math only contains tautologies. Both are _technically_ true.

walterburns 3 days ago

"But we'd like to give a type to this function. In most languages (Haskell, Typescript, C++, Rust, C#, etc.) there is no way to do that.[2] "

Can't you do this with a GADT? With a GADT, the type of the output can be specified by which branch of the GADT the input matches. Seems quite a similar idea.

nathanrf 3 days ago

GADTs let you do similar things in the simplest cases, they just don't work as well for complex constraints.
Like, append: Vector n t -> Vector m t -> Vector (n+m) t. With GADTs you need to manually lift (+) to work on the types, and then provide proofs that it's compatible with the value-level.
Then you often need singletons for the cases that are more complex- if I have a set of integers, and an int x, and I want to constrain that the item is in the set and is also even, the GADT approach looks like:
doThing :: Singleton Int x -> Singleton (Set Int) s -> IsInSet x s -> IsEven x -> SomeOutput
Which looks reasonable, until you remember that the Singleton type has to have a representation which is convenient to prove things about, not a representation that is convenient/efficient to operate on, which means your proofs will be annoying or your function itself will hardly resemble a "naive" untyped implementation.
Dependent types fix this because your values are "just" values, so it would be something like
doThing :: (x : int) -> (s : set int) -> so (is_in s x) -> so (even x) -> SomeOutput
Here, x and s are just plain values, not a weird wrapper, and is_in and event are ordinary (and arbitrary) library functions.
wk_end 3 days ago

I mean, you can trivially do it with a (non-G) ADT, if you're content with wrapping the return value.
OTOH you can do this in TypeScript easily without wrapping; it could return a union type `number | string`.
- codebje 2 days ago
  
  Isn't that just saying you can do dependent types in TypeScript easily so long as you're willing to give up type checking?
  - mpoteat 2 days ago
    
    Actually, it's even simpler: you should just be able to use signature overloading:
    myFunc(x: true): number myFunc(x: false): string
    The article's assertion that TypeScript can't represent this is quite false.
    
    codebje 2 days ago
    
    Alas, no:
    foo.ts:1:10 - error TS2393: Duplicate function implementation.
    
    wk_end 2 days ago
    
    Like this:
    https://www.typescriptlang.org/play/?#code/GYVwdgxgLglg9mABA...
    It's a bit of a hack, though - if you incorrectly implement myFunc so that, say, it returns a string even when x is true, it can violate the overloading assertions without any error or warning.
    
    codebje 2 days ago
    
    Thanks! If anyone else is playing along, to get this to work with tsc I needed to use "--strict".
    Being able to use a function along those lines in a type safe manner is still pretty nifty, even if you can't write it with type safety.
  - wk_end 2 days ago
    
    No, because 1) you can't do (full) dependent types in TypeScript, 2) it's not clear what "dependent types" would even mean without type checking, and 3) using the union type you aren't giving up type checking at all.
    The limitation is what comes later in the article - the dependent type can express the relationship between the argument and the returned type. But that's "out of scope" of the question of "typing a function that can return either a number or a string depending on the boolean argument". It only becomes a problem later on because to use the result you need to test the type (using `typeof`) even if you know you which boolean you passed in; that information doesn't end up tracked by the type system - although there actually are a couple of different ways to do this that satisfy this example in TypeScript, as suggested by some other comments.
    Just using a union type in TypeScript isn't dissimilar to my suggestion of wrapping the result in an ADT; it's just that TypeScript makes this lightweight.
    
    codebje 2 days ago
    
    The dependent type question is, I think, subtly different: "a function whose return type is either a number or a string depending on the boolean argument."
    The union type (or the more or less equivalent Either ADT) don't give you type checking because you can take either (sic) branch no matter what your boolean value was. You're free to write a function that takes false and returns an int, and the type checker will accept that function as correct.
    No matter how you twist and turn in GHC Haskell or TypeScript, you can't prevent that just by the type signature alone, so you have to "trust me bro" that the code does what the comment says.
    For this example that's trivial and laughable, but being able to trust that code does what the compiler-enforced types say and nothing more is perhaps a step towards addressing supply chain attacks.

scrubs 2 days ago

Well written! I learned stuff. Dependent types was new on me. Cheers.

jesse__ 3 days ago

Off the topic of dependent types, but this is way too hilarious to pass up..

> Imagine if pickType did a network request to a server and then returned whatever type the server said. Sounds like a bad idea!

Fucking .. isn't this what LSPs are?! Finally something functional programming people and I can agree on. That does sound like a bad idea.

ChadNauseam 3 days ago

I think it's a bit apples and oranges. I was suggesting that compilation itself should probably not involve nondeterministic network requests. If I understand LSP correctly, it only uses network requests to allow your editor to communicate with a binary that provides type information. But the compiler behavior is unchanged. Honestly LSPs seem pretty reasonable to me.
- jesse__ 3 days ago
  
  What's the fundamental difference between a syntax highlighter and the frontend of a compiler? I would argue not much, apart from the fact that people are willing to have their syntax highlighter randomly explode, but are much more concerned about their compiler randomly exploding.
  - jibal 2 days ago
    
    The fundamental difference is that a compiler's actions determine the semantics of the program; a syntax highlighter has no effect on it. And I don't think the concern here is about things blowing up. (Not that things blowing up isn't concerning, but it just doesn't have anything to do with this discussion.)
  - hyghjiyhu 3 days ago
    
    The fundamental difference is that in his case any conforming compiler has to do the network request. Whereas with LSP it's just an implementation detail of the editor.
    
    jibal 2 days ago
    
    The fundamental difference is that a compiler determines the semantics of the program being compiled, and having that be non-deterministic is disastrous ... whereas a syntax highlighter has no effect on the semantics of the program that is generated from the code being highlighted. Things blowing up is a complete red herring.
    
    jesse__ 3 days ago
    
    My point is "why do we tolerate network requests in LSPs, but not compiler frontends"?
    If you think about it, the LSP is a way of doing libraries, which is dramatically worse than just inventing a binary interface and doing actual dynamic libraries.
    If you follow the LSP model to it's natural conclusion, everything becomes this horrible, disconnected morass of servers talking to each other on your local machine..
    
    skybrian 3 days ago
    
    It's not network requests that are a problem. It's nondeterministic network requests. Build systems do make network requests, but assuming the network is up and the build servers are running, the result of a compile should always be the same. If the server isn't reachable then the build fails.
    Also, while build systems can do stuff over the network, the program isn't allowed to run arbitrary network code at compile time. At most it can import a library at a particular URL. The build system will need to download the library the first time, but it can also cache libraries so no download is needed.
    
    jesse__ 3 days ago
    
    > It's not network requests that are a problem. It's nondeterministic network requests.
    Ohhhhh, right, my bad. I forgot that you could just make network requests deterministic with the __do_it_determinstically flag. Of courseeee, how could I be so silly.
    ..
    Think about it; you just illustrated my point exactly. If the network request is assumed to always complete, with the desired payload,, and the client/server exist on the same machine, then why the fuck is there a network request?? An LSP server is just a library, but instead of being loaded into the process address space like a normal library, there's a separate server for some unknown reason.
    Anyways, /rant
    
    skybrian 2 days ago
    
    A local network connection is the lowest-common-denominator way to do foreign function calls. This allows the language server to be written in any language you like without having to deal with the complexity of linking code written in different languages together. It just starts another process and connects to it.
    For example, VS Code is written in TypeScript and the Go language server is written in Go.
    
    ChadNauseam 3 days ago
    
    Do the client and server necessarily exist on the same machine? I thought vscode had an architecture where the frontend and backend could be on separate machines, although I don't know if the LSP crosses that gap.
    Also, I'm sure there's a way to do it, but if someone wanted to write an LSP in Haskell and the editor was written in JavaScript, I'm not sure it would be as straightforward to load it as a dynamic library as you suggest it should be.
    
    jesse__ 3 days ago
    
    Theoretically, the client/server don't have to be on the same machine, but in practice they mostly are.
    As for the interface.. any language that can't speak the C ABI isn't a real language (read: they all can).. but this is besides the point.
    Counterpoint; when's the last time you used an editor written in JavaScript and an LSP written in Haskell? I bet never.
    
    skybrian 2 days ago
    
    If you install the VSCode plugin for Haskell, I’m pretty sure that’s what you get?
    
    jesse__ 2 days ago
    
    It's funny that I forgot people actually use VSCode .. I think of it as a bad joke.
    Fair point though.
    
    tome 2 days ago
    
    I think VSCode is written in TypeScript, but yes, I found counterpoint very strange too!
  - chongli 3 days ago
    
    A compiler is allowed to halt on a syntax error. Syntax highlighters ideally should not give up on colouring the rest of the document due to mismatched parentheses (or braces or any other delimiter pair).
    
    jibal 2 days ago
    
    Many syntax highlighters in fact do so, but this simply isn't what the point was about network indeterminacy--compilation determines the semantics of the program being compiled, syntax highlighters do not.
    
    jesse__ 3 days ago
    
    Fair point
Quekid5 3 days ago

No, LSPs return the name/metadata of a concrete type. Dependent typing means that the return type of any given function in your (static) program can depend on a runtime value, e.g. user input... In well-defined ways ofc.
- jesse__ 3 days ago
  
  So, you're saying it's outside the scope of an LSP to return information about a dependent type because it's .. not a concrete type? That sounds wrong.
  I can make literally any language support dependent types that have struct, enum, switch, and assert. You make a boxed type (tagged union, algebraic datatype, variant, discriminated union, fucking, whatever), and just assert whenever you pack/unpack it on function entry/exit. I do this all the time.
  In plain English, my quip boils down to 'why do we tolerate network requests in our syntax highlighters, when we don't tolerate them in our compiler frontends?'
  - wredcoll 3 days ago
    
    Because in this world my vim editor gets to benefit from the money and time microsoft spent building a typescript syntax highlighter.
    If I waited for one that spoke the c abi so someone could load it via dll instead of http, well, I'd still be waiting.
IceDane 2 days ago

This is easily the stupidest thing I've read on HN in quite a while, and your continued arrogance in the rest of the thread doesn't help. Quite an achievement for a site so full of people who have a direct line from their ass to the comment section. Congratulations!