yuliyp 8 hours ago

In Hack, collection objects were one of the biggest early mistakes, and they took a huge amount of effort to undo. It turns out that the copy-on-write semantics of PHP arrays were extremely important for performance and good APIs. Being able to pass arrays to things without fear of mutation allowed for tons of optimizations and removed the need to copy things just in case. This is why Hack switched to using `dict`, `vec`, and `keyset` rather than collection objects.

More generally, it's weird to see a whole blog post about generics for PHP not even mentioning Hack's generics designs. A lot of thought and iteration went into this like 5-10 years ago.

See https://docs.hhvm.com/hack/arrays-and-collections/object-col... and https://docs.hhvm.com/hack/arrays-and-collections/vec-keyset...

  • maxloh 6 hours ago

    I thought Hack was dead or not intended for public use anymore.

    But after some quick checking, I learned that Hack is still actively maintained, surprisingly.

    • meindnoch 2 hours ago

      The backend of Facebook's frontend is written in Hack. For the longest time, I wouldn't believe it... and then I saw the code with my own eyes.

    • phplovesong 6 hours ago

      It is, and Hack exists ONLY because PHP was not cutting it for a big project like facebook. It's the biggest example out there of why you should not use PHP for anything close to, say, 10% of facebook scale.

      • nolok 5 hours ago

        > It's the biggest example out there of why you should not use PHP for anything close to, say, 10% of facebook scale.

        What a weird take to base your current belief on something that happened more than a decade ago.

        Not only have the conditions behind Hack's creation (speed, memory usage and strict type checking) been fixed a long time ago, since PHP 7.0.

        But also if you reach 10% of facebook scale, it doesn't matter what language you used, you will need to rewrite anyway.

        Show me a company where PHP is the issue because they reached 10% of facebook scale, and what you're showing is a company that succeeded thanks to PHP. The same applies to other languages. Picking your stack based on "but what if I reach that scale" has to be the mother of all premature optimisations.

      • 4ndrewl 4 hours ago

        I doubt anyone here is working to 1% of fb scale, let alone 10%.

        The answer to "why Hack" needs to be viewed in the historical context of "when Hack" and what was happening (or not) in the php ecosystem at that time.

        Things have changed a lot since, in terms of performance, language longevity, ecosystem etc. It's a perfectly reasonable language to adopt for many orgs.

        • rob74 3 hours ago

          More specifically, Hack was developed during the time when it looked like the PHP project wasn't going anywhere (it was stuck at version 5.x for ten years while version 6 was in the works, then abandoned, then version 7 was developed based on version 5 and finally released in 2015).

rob74 3 hours ago

So, PHP started off as a loosely typed language, then got types... and now they want to implement generics to have more loosely typed code? But as I understand it, types are still optional, so you can still use untyped variables for "generic" code? I'm probably missing something here, is it because of performance concerns? Or the edge case of absolutely wanting strongly typed PHP throughout (except for the part where they want generics)?

  • purerandomness 2 hours ago

    Today, the vast majority of commercial PHP projects are developed enforcing the use of strong types by static analyzers like PHPStan in the CI pipeline, and having the strict_types declaration set.

    As a community, we've seen enough untyped PHP spaghetti code in the early 2000s and never want to go back there.

  • actionfromafar 2 hours ago

    Generics are not equivalent to loose types.

tobinfekkes 12 hours ago

Can someone smarter than me explain what they mean by "reified generics", "erased generics", and a use case for when to use one over the other?

  • meindnoch 2 hours ago

    With reified generics, the code

      class Foo<X> {
        X x;
      }
    
      Foo<int> fooInt;
      fooInt.x = 5;
      Foo<float> fooFloat;
      fooFloat.x = 5.0;
    
    compiles to:

      class Foo_int {
        int x;
      }
    
      class Foo_float {
        float x;
      }
    
      Foo_int fooInt;
      fooInt.x = 5;
      Foo_float fooFloat;
      fooFloat.x = 5.0;
    
    
    On the other hand, erased generics compile to this:

      class Foo {
        void* x;
      }
    
      Foo fooInt;
      fooInt.x = new int(5);
      Foo fooFloat;
      fooFloat.x = new float(5.0);
  • gloryjulio 12 hours ago

    For example, Java uses erased generics. Once the code is compiled, the generics information is no longer in the bytecode. List<String> becomes the raw type List. This is called type erasure.

    C# uses reified generics, where this information is preserved. List<String> is still List<String> after compilation.
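
    A minimal sketch of what erasure looks like from inside a running Java program. The class name `ErasureDemo` and the helper `sameRuntimeClass` are made up for illustration; the only claim is that two differently parameterized lists report the same runtime class.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Hypothetical helper: true if both lists share one runtime class,
    // which is what happens when the type argument has been erased.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();

        System.out.println(sameRuntimeClass(strings, numbers)); // true
        System.out.println(strings.getClass().getName());       // java.util.ArrayList
    }
}
```

    Nothing on either instance records whether it was created as a list of String or of Integer, so a check like `list instanceof List<String>` has no runtime information to work with.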

    • branko_d 10 hours ago

      And as a consequence, C# can pack the value types directly in the generic data structure, instead of holding references to heap-allocated objects.

      This is very important both for cache locality and for minimizing garbage collector pressure.

      • kgeist 7 hours ago

        With reified generics, you can also do "new T[]" because the type is known at runtime. With type erasure, you can't do that.
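
        For the Java side of this, a rough sketch of the usual workaround: since `new T[n]` does not compile, the caller hands over the element class explicitly and the array is created reflectively. `GenericArrayDemo` and `newArray` are illustrative names, not part of any library.

```java
import java.lang.reflect.Array;

final class GenericArrayDemo {
    // `new T[length]` is rejected by the Java compiler because T is erased,
    // so the element class is passed in explicitly as a "class token".
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> elementType, int length) {
        // Array.newInstance returns Object; the cast is unchecked but safe
        // here for reference types, because the array really has that component type.
        return (T[]) Array.newInstance(elementType, length);
    }

    public static void main(String[] args) {
        String[] names = newArray(String.class, 3);
        System.out.println(names.length);                                  // 3
        System.out.println(names.getClass().getComponentType().getName()); // java.lang.String
    }
}
```

        The extra `Class<T>` parameter is exactly the runtime type information that a reified implementation would carry along for you.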

      • svieira 10 hours ago

        And Java has been working on Project Valhalla for ~20 years to retrofit the ability to do this to the existing Java language...

        • pjmlp 4 hours ago

          The goal for Valhalla is value types; whether reified generics ever happen is still open.

          The project was announced in July 2014, hardly 20 years.

          Also, the reason they are still at it is the question of how to run old JARs without breaking semantics in a Valhalla-enabled JVM.

          Had Oracle wanted to do a Python 3, Valhalla would have been done by now, however we all know how it went down, and Java 9 was already impactful enough to the ecosystem.

          • ygra 3 hours ago

            > The goal for Valhalla is value types, reified generics if they ever happen is still open.

            But if they want the List<int> use case to be fast they basically have to keep this information at runtime and will have to make changes to how objects are laid out in memory. I'm not sure there's a good way around that if you want List<int> to be backed by an int[] and `get` returning an int instead of an Object. This may or may not be available to developers and remain internal to the JVM in the beginning, but I think it's necessary to enable the desired performance gains.

            They also state on the website: »Supplementary changes to Java’s generics will carry these performance gains into generic APIs.«

            • pjmlp 2 hours ago

              Haskell and OCaml are two runtimes that do well enough with type erasure, given how polymorphic types get implemented across their implementations.

              Probably MLton is the only implementation that actually does it the C++ and Rust way.

              So let's see how far they go.

              I always considered it a mistake for Java to ignore what GC-enabled languages were doing at the time (Eiffel, Modula-3, Oberon and friends), which they naturally looked into given their influences, but it wasn't deemed necessary for the original Java purposes of being a set-top box and applets language.

              Now we have a good case of what happens when we try to retrofit such critical features after decades of field usage, a lesson that Go folks apparently failed to learn as well.

        • metadat 8 hours ago

          Reified generics don't seem to be a goal mentioned on the project website. Am I missing something?

          https://openjdk.org/projects/valhalla/

          There is an interesting article which mentions reification, but that's all I could locate.

          How We Got the Generics We Have (Or, how I learned to stop worrying and love erasure)

          https://openjdk.org/projects/valhalla/design-notes/in-defens...

          • svieira 5 hours ago

            Reified generics aren't on the board, but a solution to:

            > And as a consequence, C# can pack the value types directly in the generic data structure, instead of holding references to heap-allocated objects.

            is what Project Valhalla is all about. (Java doesn't have a good reason for being able to do `new T` at the moment, but being able to treat a generic container as optimizable-over-structs is an explicit goal).

    • PaulGaspardo 10 hours ago

      Incidentally if you do what they're proposing for PHP in Java (where you define a non-generic subclass of a generic type), the actual generic type parameters actually are in the bytecode, and depending on the static type you use to reference it, may or may not be enforced...

         public class StringList extends java.util.ArrayList<String> {
             public static void main(String[] args) throws Exception {
                 StringList asStringList = new StringList();
                 java.util.ArrayList<Integer> asArrayList = (java.util.ArrayList<Integer>)(Object)asStringList;
                 System.out.println("It knows it's an ArrayList<String>: " + java.util.Arrays.toString(((java.lang.reflect.ParameterizedType)asArrayList.getClass().getGenericSuperclass()).getActualTypeArguments()));
                 System.out.println("But you can save and store Integers in it:");
                 asArrayList.add(42);
                 System.out.println(asArrayList.get(0));
                 System.out.println(asArrayList.get(0).getClass());
                 System.out.println("Unless it's static type is StringArrayList:");
                 System.out.println(asStringList.get(0));
             }
         }
      
      That prints out:

         It knows it's an ArrayList<String>: [class java.lang.String]
         But you can save and store Integers in it:
         42
         class java.lang.Integer
         Unless it's static type is StringArrayList:
         Exception in thread "main" java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.String (java.lang.Integer and java.lang.String are in module java.base of loader 'bootstrap')
          at StringList.main(StringList.java:11)
  • Gibbon1 12 hours ago

    I'm not smarter than you but.

    I believe the terms reified generics and erased generics are the type of sweaty donkey ball terminology you get from professional CS academics.

    Sticking my neck out further.

    Reified generics means the type is available at run time. In C# you can write if(obj.GetType() == typeof(typename))

    Erased generics the type information is not available at run time. That's the way Java does it and it kinda sucks.

    • p1necone 12 hours ago

      Academics invent short names for common (in their field) concepts not because they're 'sweaty' but because if the thing you're going to mention in every second paragraph in a good chunk of the communication you do with other people working on the same topic requires a full sentence to explain you're going to A. get really annoyed at having to type it out all the time and B. probably explain it slightly differently every time and confuse people.

      Academic jargon isn't invented to be elitist, it's invented to improve communication.

      (of course there's a good chance you understand this already, and you're just making a dumb joke, but I figured I'd explain this anyway for the benefit of everyone reading)

      • fuzzy_biscuit 11 hours ago

        I don't take issue with the naming but with the names that feel a bit beyond my ken. "Erased" makes sense when explained but not before. "Reified" is a word I simply do not use so it feels like academia run amok.

        Regardless, I recognize myself as the point of failure, but those names do strike me as academia speak, though better than some/many. <shrug>

        • klodolph 10 hours ago

          Another shrug, but part of it is that the PL community (programming language community) is pretty deep into its own jargon that doesn’t have as much overlap as you might think, with other subfields of computer science.

          People describe a type system as “not well-founded” or “unsound” and those are specific jabs at the axioms, and people talk about “system F” or “type erasure” or “reification”. Polymorphism can be “ad-hoc” or “parametric”, and type parameters can be invariant, covariant, and contravariant. It’s just a lot of jargon and I think the main reason it’s not intuitive to people outside the right fields is that the actual concepts are mostly unfamiliar.

          • bawolff 6 hours ago

            > Another shrug, but part of it is that the PL community (programming language community) is pretty deep into its own jargon that doesn’t have as much overlap as you might think, with other subfields of computer science.

            The word reified dates back to the 1800s. It isn't the most common word, but it also definitely wasn't invented by the programming language community.

    • mrkeen 5 hours ago

      > Erased generics the type information is not available at run time. That's the way Java does it and it kinda sucks.

      In a good statically-typed language you don't need runtime type information. It could be a Void in the bytecode for all I care, as long as it behaves correctly.

      > obj.GetType() == typeof(typename)

      In a statically-typed language, this can be optimised away to a bool at compile time.

      • Gibbon1 2 hours ago

        Oh absolutely not true AT ALL.

    • skissane 12 hours ago

      > Erased generics the type information is not available at run time. That's the way Java does it and it kinda sucks.

      To be more precise: in Java, generics on class/method/field declarations are available at runtime via reflection. The issue is that they aren’t available for instances. So a java.util.ArrayList<java.lang.String> instance is indistinguishable at runtime from a java.util.ArrayList<java.lang.Object> instance.
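
      A small sketch of that distinction, assuming a made-up class `DeclarationVsInstance` with a field `names` that exists only so there is a generic declaration to inspect:

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

final class DeclarationVsInstance {
    // Hypothetical field, present only to have a generic declaration to reflect on.
    static List<String> names = new ArrayList<>();

    // Reads the type argument recorded on the field *declaration*.
    static Type declaredElementType() {
        try {
            Field field = DeclarationVsInstance.class.getDeclaredField("names");
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            return type.getActualTypeArguments()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The declaration remembers its type argument.
        System.out.println(declaredElementType());                   // class java.lang.String
        // The instance only knows its raw class and the type variable E.
        System.out.println(names.getClass());                        // class java.util.ArrayList
        System.out.println(names.getClass().getTypeParameters()[0]); // E
    }
}
```

      The String shows up only because the field signature is stored in the class file; asking the ArrayList object itself yields nothing more specific than the unbound variable E.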

SoftTalker 10 hours ago

In the sense of an affirmative vote, the proper word is "yea."

twiss 11 hours ago

I may be missing something about how the PHP compiler/interpreter works, but I don't quite understand why this is apparently feasible to implement:

    class BlogPostRepository extends BaseRepository<BlogPost> { ... }
    $repo = new BlogPostRepository();
but the following would be very hard:

    $repo = new Repository<BlogPost>();
They write that the latter would need runtime support, instead of only compile time support. But why couldn't the latter be (compile time) syntactic sugar for the former, so to speak?

(As long as you don't allow the generic parameter to be dynamic / unknown at compile time, of course.)

  • jasone 11 hours ago

    The former merely exposes a `BlogPostRepository` class. The latter requires some mechanism for creating a generic object of concrete type, which is a much bigger change to the implementation. Does each parametrized generic type have its own implementation? Or does each object have sufficient RTTI to dynamically dispatch? And what are the implications for module API data structures? Etc. In other words, this limitation avoids tremendously disruptive implementation impacts. Not pretty, but we're talking PHP here anyway. ;-)

  • baobun 7 hours ago

    Usually you are right. I assume the reason it can't be sugar is that, "because PHP", the value/type of BlogPost cannot be derived at compile time?

johnisgood 4 hours ago

Nay, enough of complexity as it is, for now.

somat 7 hours ago

I am not much of a programmer so I was trying to figure out what generics are. And I am sure they are great, but my inner gremlin goes in sarcastic tone "we want these elaborate type systems, but we also want typeless because that is far more convenient to use, so we invented generics, a method to untype your type system"

But really while I was reading up on what generics are I went, isn't that just python, strongly typed but your functions don't have built in type checks.

  • tibbar 6 hours ago

    As you say, generics really only apply to typed languages, and they help solve a very legitimate annoyance in most of those languages -- you often have some library helper or algorithm that can apply to a wide range of things, but those things differ in some way that's irrelevant to the algorithm.

    For example, a mergesort algorithm works on any kind of array, as long as you can compare the elements in the array to each other. There's no point in re-implementing the algorithm for each kind of array. Yet, without generics, you'd need to do just that. At the same time, the generated code for sorting each different kind of array might need to be a little different - comparing strings and floats isn't the same assembly, for instance. So the programming language and compiler work together: you specify the algorithm once in a certain way, and the compiler can generate the right version of the algorithm for each way that you need to use it.
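
    To make the mergesort example concrete, here is a rough sketch in Java (the `MergeSort` class is made up for illustration). One caveat: Java compiles a single erased version that dispatches through `compareTo`, rather than generating a specialized copy per element type the way C++ or Rust would, but the "write it once, use it for any comparable type" part is the same.

```java
import java.util.ArrayList;
import java.util.List;

final class MergeSort {
    // Sorts any list whose elements can be compared to each other.
    // Returns a new list and leaves the input untouched.
    static <T extends Comparable<? super T>> List<T> sort(List<T> items) {
        if (items.size() <= 1) {
            return new ArrayList<>(items);
        }
        int mid = items.size() / 2;
        List<T> left = sort(items.subList(0, mid));
        List<T> right = sort(items.subList(mid, items.size()));

        List<T> merged = new ArrayList<>(items.size());
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            // <= keeps equal elements in their original order (stable sort).
            if (left.get(i).compareTo(right.get(j)) <= 0) {
                merged.add(left.get(i++));
            } else {
                merged.add(right.get(j++));
            }
        }
        while (i < left.size()) merged.add(left.get(i++));
        while (j < right.size()) merged.add(right.get(j++));
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(sort(List.of("pear", "apple", "fig"))); // [apple, fig, pear]
        System.out.println(sort(List.of(2.5, -1.0, 2.0)));         // [-1.0, 2.0, 2.5]
    }
}
```

    The bound `T extends Comparable<? super T>` is the "as long as you can compare the elements" requirement written down as a type: passing a list of something that is not comparable is rejected at compile time instead of failing at run time.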

    There are many, many good reasons why you might want to work in a typed language, even though specifying the types is a bit of extra book-keeping; generics are one way to keep the pointless work down. Of course, if you can get away with a python script, there's no need to bother with all this typing business just yet, either.

  • masklinn 4 hours ago

    > "we want these elaborate type systems, but we also want typeless because that is far more convenient to use, so we invented generics, a method to untype your type system"

    It's rather the exact opposite. Parametric types are a way to properly type "deeply" instead of just the topmost layer. Just like inference, type parameters don't remove types.

    > isn't that just python, strongly typed but your functions don't have built in type checks.

    That doesn't really make any sense? Static types mean you don't have runtime type checks, since the types are known statically.

    • oaiey 2 hours ago

      I also do not understand the notion that generics are loosely typed or dynamic. I think people mix this up, since generics make frameworks and libraries more dynamic (in a strictly typed way), but that is a very different kind of dynamic than a dynamic language.

calvinmorrison 10 hours ago

write PHP a lot. every day.

I wish we had typed arrays. Totally not gonna happen, there have been RFCs, but I have enough boilerplate classes that are like

class Option
class Options implements Iterator, Countable, etc.

Options[0], Options[1], Options[2]

or Options->getOption('some.option.something');

A lot of wrapper stuff like that is semi tedious, the implementation can vary wildly.

Also because a lot of times in php you start with a generic array and decide you need structure around it so you implement a class, then you need an array of class,

Not to mention a bunch of WSDLs that autogenerate ArrayOfString classes...

  • bakje 2 hours ago

    We used to have a lot of classes like that, but for us PHPStan is sufficient and we effectively have generics now through static analysis warning us of improper usage of types in our CI and IDEs.

    Is this not suitable for you?

  • wesammikhail 9 hours ago

    Nailed it.

    This is the core problem with PHP for me. I love PHP and use it every day. Part of that is the strength and versatility of the array implementation (i.e. hashmap). However, the problem is always the fact that an array can't be typed.

    If they could just introduce that, it would solve 80% of user-land issues overnight.

noelwelsh 4 hours ago

The introduction to me reads as very confused:

> One of the most sought-after features for PHP is Generics: The ability to have a type that takes another type as a parameter. It's a feature found in most compiled languages by now, but implementing generics in an interpreted language like PHP, where all the type checking would have to be done at runtime, has always proven Really Really Hard(tm), Really Really Slow(tm), or both.

* The topic of the article is implementing generics at compile time, but this claims that PHP is not compiled.

* Type checking is orthogonal to compilation vs. interpretation.

* Types are not checked at runtime. It is kinda the point of types that they are checked before code runs. Runtime checks are on values. You can reify types at runtime but this breaks a useful property of generics (parametricity) and it prevents the very useful feature of types without a runtime representation (often known as newtypes).

* If you want to use types in the colloquial "dynamic type" meaning as tags on values, and you also want to talk about generics (a feature that only makes sense for types-as-compile-time-properties) you need to be really careful in your terminology or confusion will abound!

  • oaiey 3 hours ago

    PHP afair compiles the textual PHP into some intermediate which is then run. So there is a compilation stage.

    • noelwelsh 2 hours ago

      That's my understanding as well, so I don't understand why the introduction is worded as it is.