gdiamos 19 hours ago

Transmeta made a technology bet that dynamic compilation could beat out-of-order (OOO) superscalar CPUs on SPEC.

The bet turned out to be wrong, but it was genuinely controversial among experts at the time.

I’m glad that they tried it even though it turned out to be wrong. Many of the lessons learned are documented in systems conferences and incorporated into modern designs, e.g. GPUs.

To me Transmeta is a great example of a venture investment. If it had beaten Intel at SPEC by a meaningful margin, it would have dominated the market. Sometimes the only way to get to the bottom of a complex system is to build it.

The same could be said of scaling laws and LLMs. It was theory before Dario, Ilya, OpenAI, et al. trained the models.

  • fajitaforce5 12 hours ago

    I was an Intel CPU architect when Transmeta started making claims. We were baffled by those claims. We were pushing the limits of our pipelines to get incremental gains, and they were claiming to beat a dedicated arch on the fly! None of their claims made sense to ANYONE with a shred of CPU arch experience. I think your summary has rose-colored lenses, or reflects the layman’s perspective.

    • nostrademons 11 hours ago

      I think this is a classic hill-climbing dilemma. If you start in the same place, and one org has worked very hard and spent a lot of money optimizing the system, they will probably come out on top. But if you start in a different place, reimagining the problem from first principles, you may or may not find yourself with a taller hill to climb. Decisions made very early on in your hill-climbing process lock you in to a path, and then the people tasked with optimizing the system later can't fight the organizational inertia to backtrack and pick a different path. But a new startup can.

      It's worth noting that Google actually did succeed with a wildly different architecture a couple years later. They figured "Well, if CPU performance is hitting a wall - why use just one CPU? Why not put together thousands of commodity CPUs that individually are not that powerful, and then use software to distribute workloads across those CPUs?" And the obvious objection to that is "If we did that, it won't be compatible with all the products out there that depend upon x86 binary compatibility", and Google's response was the ultimate in hubris: "Well we'll just build new products then, ones that are bigger and better than the whole industry." Miraculously it worked, and made a multi-trillion-dollar company (multiple multi-trillion-dollar companies, if you now consider how AWS, Facebook, TSMC, and NVidia revenue depends upon the cloud).

      Transmeta's mistake was that they didn't re-examine enough assumptions. They assumed they were building a CPU rather than an industry. If they'd backed up even farther they would've found that there actually was fertile territory there.

      • cpgxiii 8 hours ago

        > It's worth noting that Google actually did succeed with a wildly different architecture a couple years later. They figured "Well, if CPU performance is hitting a wall - why use just one CPU? Why not put together thousands of commodity CPUs that individually are not that powerful, and then use software to distribute workloads across those CPUs?" And the obvious objection to that is "If we did that, it won't be compatible with all the products out there that depend upon x86 binary compatibility", and Google's response was the ultimate in hubris: "Well we'll just build new products then, ones that are bigger and better than the whole industry." Miraculously it worked, and made a multi-trillion-dollar company (multiple multi-trillion-dollar companies, if you now consider how AWS, Facebook, TSMC, and NVidia revenue depends upon the cloud).

        Except "the cloud" at that point was specifically just a large number of normal desktop-architecture machines. Specifically not a new ISA or machine type, running entirely normal OS and libraries. At no point did Google or Amazon or Microsoft make people port/rewrite all of their software for cloud deployment.

        At the point that Google's "bunch of cheap computers" was new, CPU performance was still rapidly improving. The competition was traditional "big iron" or mainframe systems, and the novelty was in achieving high reliability through distribution, rather than building on fault-tolerant hardware. By the time the rate of CPU performance improvement was slowing in the mid-2000s, large clusters of smaller machines were omnipresent in supercomputing and HPC applications.

        The real "new architecture(s)" of this century are GPUs, but much of the development and success of them is the result of many iterations and a lot of convergent evolution.

        • jasonwatkinspdx 6 hours ago

          > At the point that Google's "bunch of cheap computers" was new

          It wasn't even new, people just don't know the history. Inktomi and HotBot were based on a fleet of commodity PC servers with low reliability, whereas other large web properties of the time were buying big iron like Sun E10K. And of course Beowulf clusters were a thing.

          And as far as I know, Google's early ethos didn't come from some far-sighted strategy, but just from the practical reality of Page and Brin building the first versions of their search engine on borrowed/scavenged hardware as grad students and then continuing on that trajectory.

          • onecommentman 5 hours ago

            The “bunch of cheap computers” approach was being studied and implemented at the National Labs years before Google showed up. Revisionist history?

            • jasonwatkinspdx 5 hours ago

              Not revisionist, I think; it's more that a lot of people first encountered the concept with the story of Google and don't know it had plenty of precedent.

      • fajitaforce5 7 hours ago

        That’s revisionist. Transmeta set out to build a software-like CPU core. That will always lose to dedicated hardware.

      • hinkley 10 hours ago

        > Well we'll just build new products then, ones that are bigger and better than the whole industry.

        With blackjack, and hookers!

    • hinkley 10 hours ago

      The Itanium felt like Intel trying the same bet - move the speculative and analysis logic into the compiler and off the CPU. But where it differed is that it tried to leave some internal implementation details of that decoding process exposed so the compiler could call it directly, in a way that Transmeta didn’t manage.

      I wonder how long before we try it again.

    • foobiekr 6 hours ago

      Even the people on comp.arch at the time were baffled. No one believed it.

      • jasonwatkinspdx 6 hours ago

        The discussions on comp.arch from that era are a gold mine. There were lead architects from the P4 team, from the Alpha team, Linus himself during his Transmeta days... all talking very frankly about the concerns of computer architecture at the time.

    • gdiamos 11 hours ago

      It was risky.

      From my perspective it was more exciting to the programming systems and compiler community than to the computer architecture community.

    • rediguanayum 5 hours ago

      Not completely baffling. Intel made an attempt to create a Transmeta like hybrid software/hardware architecture at the time on one of their "VLIW" processors. It was an expensive experiment that didn't work out.

      • ghaff 5 hours ago

        The i860, I think, originally had a mode for that. But they then went ahead and doubled down on Itanium.

    • empw 10 hours ago

      Wasn't Intel trying to do something similar in Itanium i.e. use software to translate code into VLIW instructions to exploit many parallel execution units? Only they wanted the C++ compiler to do it rather than a dynamic recompiler? At least some people in Intel thought that was a good idea.

      I wonder if the x86 teams at Intel were similarly baffled by that.

      • BirAdam 7 hours ago

        Itanium wasn’t really focusing on running x86 code. Intel wanted native Itanium software, and x86 execution was a bonus.

      • jasonwatkinspdx 5 hours ago

        Adjacent but not the same bet.

        EPIC aka Itanium was conceived around trace optimizing compilers being able to find enough instruction level parallelism to pack operations into VLIW bundles, as this would eliminate the increasingly complex and expensive machinery necessary to do out of order superscalar execution.

        This wasn't a proven idea at the time, but it also wasn't considered trivially wrong.

        What happened is that the combination of OoO speculation, branch predictors, and fat caches ended up working a lot better than anticipated. In particular branch predictors went from fairly naive assumptions initially to shockingly good predictions on real world code.

        The result is that conventional designs increasingly trounced Itanium as the latter was still baking in the oven. By the time it was shipping it was clear the concept was off target, but at that point Intel/HP et al. had committed so much they tried to just bully the market into making it work. The later versions of Itanium ended up adding branch prediction and more cache capacity as a capitulation to reality, but that wasn't enough to save the platform.

        Transmeta was making a slightly different bet, which is that x86 code could be dynamically translated to run efficiently on a VLIW CPU. The goal here was twofold:

        First, to sidestep IP issues around shipping an x86-compatible chip. There's a reason AMD and Cyrix are the only companies to have shipped Intel alternatives in volume in that era. Transmeta didn't have the legal cover they did, so this dynamic translation approach sidestepped a lot of potential litigation.

        Second, dynamic translation to VLIW could in theory be more power efficient than a conventional architecture. VLIW at the hardware level is kinda like if a CPU just didn't have a decoder. Everything being statically scheduled also reduces design pressure on register file ports, etc. This is why VLIW is quite successful in embedded DSP-style stuff. In theory, because the dynamic translation pays the cost of compiling a block once and then calls that block many times, you could get a net efficiency gain despite the cost of the initial translation. Additionally, having access to dynamic profiling information could in theory counterbalance the problems EPIC/Itanium ran into.
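
        To make the amortization argument concrete, here is a minimal sketch in C of the translate-once, run-many idea (illustrative only; the direct-mapped cache and all the names are my own, not Transmeta's actual Code Morphing Software):

            #include <stdint.h>
            #include <stdio.h>

            typedef void (*host_block_fn)(void);   /* translated native code for one guest block */

            typedef struct {
                uint64_t      guest_pc;            /* start address of the guest (x86) block */
                host_block_fn host_code;           /* translated code, NULL until compiled */
            } tcache_entry;

            #define TCACHE_SIZE 4096
            static tcache_entry tcache[TCACHE_SIZE];
            static int translations;               /* how many times we paid the compile cost */

            static void dummy_block(void) { }      /* stand-in for emitted native code */

            static host_block_fn translate_block(uint64_t guest_pc) {
                (void)guest_pc;
                translations++;                    /* the expensive part: decode, optimize, emit */
                return dummy_block;
            }

            static host_block_fn lookup_or_translate(uint64_t guest_pc) {
                tcache_entry *e = &tcache[guest_pc % TCACHE_SIZE];
                if (e->host_code == NULL || e->guest_pc != guest_pc) {
                    e->guest_pc  = guest_pc;       /* pay the translation cost once... */
                    e->host_code = translate_block(guest_pc);
                }
                return e->host_code;               /* ...then reuse it on every revisit */
            }

            int main(void) {
                for (int i = 0; i < 1000000; i++)  /* a hot loop re-entering the same guest block */
                    lookup_or_translate(0x401000)();
                printf("executions: 1000000, translations: %d\n", translations);
                return 0;
            }

        Whether the up-front cost pays off depends entirely on how often a cached block is re-executed relative to how expensive its translation was, which is why hot loops are the friendly case and run-once code is the unfriendly one.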

        So this also wasn't a trivially bad idea at the time. Transmeta specifically targeted x86-compatible laptops, as that was a bit of a sore point in the Wintel world at the time, where the potential power efficiency benefits could motivate sales even if absolute performance was still inferior to Intel's.

        From what I recall hearing from people who had them at the time, the Transmeta hardware wasn't bad but had the sort of random compatibility issues you'd expect and otherwise wasn't compelling enough to win in the market vs Intel. Note this was also before ARM rose to dominate low power mobile computing.

        Transmeta ultimately failed, but some of their technical concepts live on in how language JITs and GPU shader IRs work today, or in how Apple used translation to migrate off PowerPC and then x86 in turn.

        In both the case of Itanium and Transmeta I'd say it's historically inaccurate to say they were obviously or trivially wrong at the time people made these bets.

    • stinkbeetle 6 hours ago

      I recall one of the biggest concerns around the time was that OOOE techniques would not continue scaling in width or depth, and that other techniques would be needed. This turned out not to be true, but it was not some fringe idea -- the entire industry turned on this. Intel designed the narrow and less "brainy" Pentium 4 and hoped to achieve performance with frequency, and with HP they designed the in-order Itanium lines. AMD did the speed-demon K9. IBM did the in-order POWER6 that got performance with high frequency and runahead speculative execution. Nvidia did a similar thing to Transmeta too, quite a while later IIRC.

      All failures. Everybody went back to more conventional out of order designs and were able to find ways to keep scaling those.

      I'm sure there were some people at all these companies who were always OOOE proponents and disagreed with these other approaches, but I think your summary has poop colored lenses :) It's a little uncharitable to say their ideas were nonsense. The reality is that this was a very uncertain and exploratory time, and many people with large shreds of cpu arch experience all did wildly different things, and many went down the wrong roads (with hindsight).

    • carabiner 6 hours ago

      what are you doing now? Retired on a farm?

      • onecommentman 5 hours ago

        We’re all retired in some sense. Some on farms, some elsewhere. But in a broader and deeper and more meaningful level, we’re not retired at all…working harder than ever.

        • carabiner 5 hours ago

          I don't think I am retired in any sense.

  • vlovich123 19 hours ago

    I think more about the timing being incorrect - betting on software in an era of exponential hardware growth was unwise (software performance can’t scale that way). The problem is that you need to marry it with a significantly better CPU/architecture because the JIT is about not losing performance while retaining back compat.

    However, if you add it onto a better CPU it’s a fine technique to bet on - case in point Apple’s move away from Intel onto homegrown CPUs.

    • cpgxiii 14 hours ago

      > However, if you add it onto a better CPU it’s a fine technique to bet on - case in point Apple’s move away from Intel onto homegrown CPUs.

      I don't think Apple is a good example here. Arm was extremely well-established when Apple began its own phone/tablet CPU designs. By the time Macs began to transition, much of their developer ecosystem was already familiar.

      Apple's CPUs are actually notably conservative when compared to the truly wild variety of Arm implementations; no special vector instructions (e.g. SVE), no online translation (e.g. Nvidia Denver), no crazy little/big/bigger core complexes.

      • vlovich123 10 hours ago

        I think you're focusing on the details and missing my broader point - the JIT translation technique only works to break out of instruction set lock-in. It does not improve performance, so betting on that instead of superscalar designs is not wise.

        Transmeta’s CPU was not performance competitive and thus had no path to success.

        And as for Apple itself, they had built the first iPhone on top of ARM to begin with (partially because Intel didn’t see a market). So they were already familiar with ARM before they even started building ARM CPUs. But also the developer ecosystem familiarity is only partially relevant - even in compat mode the M1 ran faster than equivalent contemporary Intel chips. So the familiarity was only needed to unlock the full potential (most of which was done by Apple porting 1p software). But even if they had never switched on ARM support in the M1, the JIT technique (combined with a better CPU and better unified memory architecture) would still have been fast enough to slightly outcompete Intel chips on performance and battery life - native software just made it no contest.

        • musicale an hour ago

          > And as for Apple itself, they had built the first iPhone on top of ARM to begin with (partially because Intel didn’t see a market). So they were already familiar with ARM before they even started building ARM CPUs

          As a co-founder of Advanced RISC Machines Ltd, Apple had a history with ARM dating back to 1990 at least:

          > Arm was officially founded as a company in November 1990 as Advanced RISC Machines Ltd, which was a joint venture between Acorn Computers, Apple Computer (now Apple Inc.), and VLSI Technology (now NXP Semiconductors N.V)

          https://newsroom.arm.com/blog/arm-official-history

          Apple also shipped ARM-based Newton systems from 1993-98, and ARM-based iPods starting in 2001.

        • rugina 9 hours ago

          > partially because Intel didn’t see a market

          I saw some articles saying that Intel saw the market very well; they just could not deliver, and rather than admit that, they claimed the CEO decided wrong.

          • vlovich123 9 hours ago

            Both were probably true to some extent, but I doubt they wouldn’t have figured out a way to execute, given the huge opportunity.

            The mobile CPU market is worth a meaningful chunk of Intel’s overall current market cap, and they’re not participating in it.

        • cpgxiii 5 hours ago

          > I think you're focusing on the details and missing my broader point - the JIT technique for translation only works to break out of the instruction set lock-in. It does not improve performance, so betting on that instead of super scalar designs is not wise.

          > Transmeta’s CPU was not performance competitive and thus had no path to success.

          I think you are operating with a bit too much benefit from hindsight. In a very reductive sense, every time someone has tried to make dynamic ISA translation work, they have done so because they believe their ability to implement their "real" ISA will be superior in some way than their ability to implement the external ISA. Obviously many have failed at this, usually when trying more ambitious designs, but less ambitious designs (perhaps most famously the AMD K5 and its descendants) have succeeded.

          Apple's case is really quite different, in that unlike Transmeta or Nvidia, they already had several generations of CPU implementations on which to base their decisions prior to the point of announcing the macOS x64->arm64 transition, just as they had several generations of Intel hardware to consider when making the PPC->x86 transition.

      • almostgotcaught 13 hours ago

        > no special vector instructions (e.g. SVE)

        Wut - SVE and SME are literally Apple designs (AMX) which have been "back ported".

        • cpgxiii 13 hours ago

          > Wut - SVE and SME are literally Apple designs (AMX) which have been "back ported".

          Literally no Apple CPUs meaningfully support SVE or SVE2. Apple added what I would say is a relatively "conventional" matrix extension (AMX) of their own, and now implements SME and SME2, but those are not equivalent to SVE (I call AMX "conventional" in the sense that a fixed-size grid of matrix compute elements is not a particularly new idea, versus variable-sized SIMD, which is still quite rare). Really, the only arm64 design with "full fat" SVE support is Fujitsu's A64FX (512-bit vector size); everything else on the very short list of hardware supporting SVE is still stuck with 128-bit vectors.

    • hinkley 10 hours ago

      Would TSMC be further along today, or not, if Transmeta had been thought up five, ten years later? Would Transmeta be farther along for having encountered a more mature TSMC?

      TSMC seems to have made a lot of its bones on Arm and Apple's business.

      • vlovich123 7 hours ago

        No, Transmeta was never a big or significant player. The ARM architecture dates back to Acorn in 1985, and Apple was an early meaningful investor in Arm the company. By the turn of the millennium ARM was already well established in the microcontroller, PDA and cell phone space (smartphone and feature phone). Ten years later the iPhone was already well established. It was just never going to happen for Transmeta. TSMC made its bones being the fabricator for all sorts of custom chip designs. It almost didn’t matter whose designs they were, as long as there was increasing volume.

    • tracker1 13 hours ago

      Exactly... I think that if you look at the accelerator paths that Apple's chips have for x86 emulation, combined with software, it's pretty nifty. I do wish these were somewhat standardized/licensed/upstreamed so that other Arm vendors could use them in a normalized way.

  • rjsw 18 hours ago

    They were also the first to produce an x86 CPU with an integrated northbridge; they could have pitched it more at embedded and industrial markets, where SPEC scores are less important.

    • buildbot 13 hours ago

      They did! There are many transmeta powered thin clients for example.

      • giantrobot 10 hours ago

        And UMPCs. Sony made at least one of the PictureBooks with a Transmeta CPU and IIRC their U1 used it as well.

  • btilly 14 hours ago

    That's kind of the bet they made, but misses a key point.

    Their fundamental idea was that by having simpler CPUs, they could iterate on Moore's law more quickly. And eventually they would win on performance. Not just on a few speculative edge cases, but overall. The dynamic compilation was needed to be able to run existing software on it.

    The first iterations, of course, would be slower. And so their initial market, needed to pay for those early generations, would be low-power use cases, because the complexity of a CISC chip made that a weak point for Intel.

    They ran into a number of problems.

    The first is that the team building that dynamic compilation layer was more familiar with the demands of Linux than Windows, with the result that the compilation worked better for Linux than Windows.

    The second problem was that the "simple iterates faster" also turns out to be true for ARM chips. And the most profitable segments of that low power market turned out to be willing to rewrite their software for that use case.

    And the third problem is that Intel proved to be able to address their architectural shortcomings by throwing enough engineers at the problem to iterate faster.

    If Transmeta had won its bet, they would have completely dominated. But they didn't.

    It is worth noting that Apple pursued a somewhat similar idea with Rosetta. Both in changing to Intel, and later changing to ARM64. With the crucial difference that they also controlled the operating system. Meaning that instead of constantly dynamically compiling, they could rely on the operating system to decide what needs to be compiled, when, and call it correctly. And they also better understood what to optimize for.

    • hedgehog 12 hours ago

      I don't know if the bet was even particularly wrong. If they had done a little better job on performance, capitalized on the pains of Netburst + AMD64 transition, and survived long enough to do integrated 3D graphics and native libraries for Javascript + media decoding it might have worked out fine. That alternate universe might have involved a merger with Imagination when the Kyro was doing poorly and the company had financial pain. We'll never know.

      • btilly 12 hours ago

        I don't either. Even with their problems, they didn't miss by much.

        One key factor against them, though, is that they were facing a company whose long-term CEO had written Only The Paranoid Survive. At that point he had moved from being the CEO to the chairman of the board. But Intel had paranoia about possible existential threats baked into its DNA.

        There is no question that Intel recognized Transmeta as a potential existential threat, and aggressively went after the very low-power market that Transmeta was targeting. Intel quickly created SpeedStep, allowing power consumption to dynamically scale when not under peak demand. This improved battery life on laptops using the Pentium III, without sacrificing peak performance. They went on to produce low power chips like the Pentium M that did even better on power.

        Granted, Intel never managed to match the low power that Transmeta had. But they managed to limit Transmeta enough to cut off their air supply - they couldn't generate the revenue needed to invest enough to iterate as quickly as they needed to. This isn't just a story of Transmeta stumbling. This is also a story of Intel recognizing and heading off a potential threat.

        • choilive 7 hours ago

          I always found it ironic that Intel benignly neglected the mobile CPU/SoC market and also lost their process lead despite this supposed culture of never underestimating the competition. The paranoid Intel of the 80s/90s is clearly not the one that existed going into the 2000s and 2010s.

          • hedgehog 5 hours ago

            Intel missed mobile, graphics, and AI, while failing to deliver 10nm, and it was all self-inflicted. They didn't understand what was coming. Transmeta was an easily identified threat to Intel's core CPU products, so Intel was more likely to pull out all the stops, both competing above-board on product and via IP infringement and tortious interference. Intel had good risk management in having a team working on evolutions of P6; if that hadn't already been a going concern (see also Timna), coming up with a competitive product in the early 2000s would have been much harder.

    • hinkley 10 hours ago

      Intel was already built on the Pentium by this point. Not as iterable as pure software, but decoding x86 instructions into whatever they wanted to do internally sped up a lot of things on its own.

      Perhaps they would have been better off building the decode logic as programmable by making effectively a multicore machine where the translation code ran on its own processor with its own cache, instead of a pure JIT.

      • btilly 9 hours ago

        When you are operating at that level, there is a lot of similarity between software compiled to machine code, and software compiled to a chip design. The differences are that the machine code comes with some extra overhead, and changing chip designs takes more work.

        Fundamentally, Transmeta chose to make design iterations quicker, at the cost of some performance overhead. Intel chose that extra low-level performance, at the cost of more overhead on each design iteration. And then Intel made up for the extra cost of iterating designs by having more resources to throw at the problem.

        If Transmeta had equivalent resources to throw at their approach, they would have likely won. But they didn't. And I think that they made the right choices for the situation that they were in.

        Incidentally, the idea of programmable microcode on top of the CPU was not original. It was used in all sorts of software systems before them, such as Java. The first big use that I'm aware of was the IBM System/360, back in the 1960s. There are still programs running on mainframes today that fundamentally think they are running on that virtual machine from the 1960s!

  • CalChris 9 hours ago

    A 700 MHz Crusoe TM5400 delivered roughly the same performance as a 500 MHz Pentium III on SPEC benchmarks.

    Transmeta as a startup (the pitch to early investors) went after high performance, and maybe that could be construed as a claim that dynamic compilation could beat OOO. But the released product pivoted (like that hasn't happened before) to low power, hence the name Efficeon. Its killer app was playing a DVD on a laptop on a cross-country flight without running out of battery and without frying your own laptop. They basically invented that market, which then gave them the great good pleasure of competing with Intel.

    It was a heroic, doomed effort. Once the market was brought to its attention, Intel quickly adapted to it. Game, set, match. Ditzel became a VP at Intel.

  • andrewf 9 hours ago

    At the time I recall https://dl.acm.org/doi/pdf/10.1145/301631.301683 being an oft-discussed data point - speeding up DEC Alpha code by recompiling it into different DEC Alpha code using runtime statistics.

    This was commonly cited in forum debates about whether Java and C# could come close to the performance of compiled languages. ("JITs and GCs are fast enough, and runtime stats mean they can even be faster!" was a common refrain, but not actually as true in 1999 as it is in 2025)

  • pshirshov 18 hours ago

    Aren't modern CPUs, essentially, dynamic translators from the x86_64 instruction set into internal RISC-like instruction sets?

    • pizlonator 14 hours ago

      Folks like to say that, but that's not what's happening.

      The key difference is: what is an instruction set? Is it a Turing-complete thing with branches, calls, etc? Or is it just data flow instructions (math, compares, loads and stores, etc)?

      X86 CPUs handle branching in the frontend using speculation. They predict where the branch will go, issue data flow instructions from that branch destination, along with a special "verify that I branched to the right place" instruction, which is basically just the compare portion of the branch. ARM CPUs do the same thing. In both X86 and ARM CPUs, the data flow instructions that the CPU actually executes look different (are lower level, have more registers) than the original instruction set.

      This means that there is no need to translate branch destinations. There's never a place in the CPU that has to take a branch destination (an integer address in virtual memory) in your X86 instruction stream and work out what the corresponding branch destination is in the lower-level data flow stream. This is because the data flow stream doesn't branch; it only speculates.

      On the other hand, a DBT has to have a story for translating branch destinations, and it does have to target a full instruction set that does have branching.

      That said, I don't know what the Transmeta CPUs did. Maybe they had a low-level instruction set that had all sorts of hacks to help the translation layer avoid the problems of branch destination translation.

      • monocasa 12 hours ago

        > That said, I don't know what the Transmeta CPUs did. Maybe they had a low-level instruction set that had all sorts of hacks to help the translation layer avoid the problems of branch destination translation.

        Fixed guest branches just get turned into host branches and work like normal.

        Indirect guest branches would get translated through a hardware jump address cache that was structured kind of like TLB tag lookups are.
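
        For anyone who hasn't seen the technique, the software analogue of such a jump-target cache looks roughly like this (a sketch with made-up names; the real thing was a hardware structure, and slow_resolve stands in for the full lookup/translate path):

            #include <stdint.h>
            #include <stdio.h>

            #define JCACHE_BITS 10
            #define JCACHE_SIZE (1u << JCACHE_BITS)

            typedef struct {
                uint64_t guest_target;        /* tag: guest address the indirect branch went to */
                void    *host_target;         /* entry point of the translated block for it */
            } jcache_entry;

            static jcache_entry jcache[JCACHE_SIZE];

            static char dummy_host_block;     /* stand-in for emitted host code */

            /* Stand-in for the expensive path: find or build the translation for this target. */
            static void *slow_resolve(uint64_t guest_target) {
                (void)guest_target;
                return &dummy_host_block;
            }

            /* An indirect guest branch can't be rewritten into a fixed host branch, so at
             * runtime the guest target is hashed into a small direct-mapped cache. */
            static void *resolve_indirect_branch(uint64_t guest_target) {
                jcache_entry *e = &jcache[(guest_target >> 2) & (JCACHE_SIZE - 1)];
                if (e->host_target != NULL && e->guest_target == guest_target)
                    return e->host_target;                 /* hit: jump straight to host code */
                void *host = slow_resolve(guest_target);   /* miss: slow lookup or fresh translation */
                e->guest_target = guest_target;
                e->host_target  = host;
                return host;
            }

            int main(void) {
                void *first  = resolve_indirect_branch(0x402000);  /* miss, fills the cache */
                void *second = resolve_indirect_branch(0x402000);  /* hit */
                printf("same host target: %s\n", first == second ? "yes" : "no");
                return 0;
            }

        The appeal of doing it in hardware is that this lookup sits on the hot path of every indirect guest branch.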

        • pizlonator 12 hours ago

          Thank you for sharing!

          > Fixed guest branches just get turned into host branches and work like normal.

          How does that work in case of self-modifying code, or skewed execution (where the same x86 instruction stream has two totally different interpretations based on what offset you start at)?

          • monocasa 11 hours ago

            Skewed executions are just different traces. There's no requirement that basic blocks not partially overlap with other basic blocks. You want that anyway for optimization reasons, even without skewed execution.

            Self-modifying code is handled with MMU traps on the writes, and invalidation of the relevant traces. It is very much a slow path, though. Ideally, heavily self-modifying code is able to stay in the interpreter and not thrash in and out of the compiler.
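
            A sketch of that generic MMU-trap scheme, using POSIX mprotect/SIGSEGV as a stand-in for whatever Crusoe's firmware actually did (structure and names are my own assumptions):

                #define _GNU_SOURCE
                #include <signal.h>
                #include <stdint.h>
                #include <sys/mman.h>
                #include <unistd.h>

                static long page_size;

                /* Stand-in: drop every cached trace whose guest code lives on this page. */
                static void invalidate_traces_for_page(uintptr_t page) { (void)page; }

                /* When a translation is built for code on a guest page, write-protect that page. */
                void protect_translated_page(void *guest_page) {
                    mprotect(guest_page, (size_t)page_size, PROT_READ);
                }

                /* A guest store into a protected page faults; throw the stale traces away,
                 * re-enable writes, and let the store retry. Very much a slow path. */
                static void on_write_fault(int sig, siginfo_t *si, void *ctx) {
                    (void)sig; (void)ctx;
                    uintptr_t page = (uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1);
                    invalidate_traces_for_page(page);
                    mprotect((void *)page, (size_t)page_size, PROT_READ | PROT_WRITE);
                }

                void install_smc_handler(void) {
                    page_size = sysconf(_SC_PAGESIZE);
                    struct sigaction sa = {0};
                    sa.sa_sigaction = on_write_fault;
                    sa.sa_flags = SA_SIGINFO;
                    sigaction(SIGSEGV, &sa, NULL);
                }

            The cost model is the same either way: writes to pages holding translated guest code become very expensive, which is why code that rewrites itself constantly is better left to the interpreter.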

            • pizlonator 10 hours ago

              > Self-modifying code is handled with MMU traps on the writes, and invalidation of the relevant traces. It is very much a slow path, though. Ideally, heavily self-modifying code is able to stay in the interpreter and not thrash in and out of the compiler.

              This might end up having a bad time running JavaScript VM JITed code, which self-modifies a lot.

              But all of that makes sense! Thanks!

              • monocasa 10 hours ago

                Yeah, nesting JITs was kind of always an Achilles heel of this kind of architecture.

                IIRC, they had a research project to look at shipping a custom JVM that compiled straight to their internal ISA to skip the impedance mismatch between two JITs. JITed JS (or really any extremely dynamic code that also asks for high perf) probably wasn't even on their radar given the era; even the Smalltalk VM that HotSpot derived from was a strongly typed derivative of Smalltalk.

      • stinkbeetle 5 hours ago

        This is not true. x86 CPUs have long had micro-op caches that support taken branches which do not result in icache fetches. It probably started with the Pentium 4's trace cache, which was perhaps a little more similar to Transmeta's design, but modern x86 CPUs from Intel and AMD both do dynamic translation from x86 to an internal instruction format that includes branches and likely involves some transformation (e.g., some fusion and perhaps cracking).

        The motivations and mechanics and performance characteristics are all very different than what Transmeta did, but still it is difficult to argue that modern x86 CPUs do not translate x86-64 into their own internal instruction sets even if you have this branching requirement.

    • p_l 18 hours ago

      Not to the same level. Crusoe was, in many ways, more classic CISC than x86 - except its microcode was actually doing dynamic translation to an internal ISA instead of operating like the interpreter in old CISCs.

      The x86 ISA had the funny advantage of being way closer to RISC than "beloved" CISC architectures of old like m68k or VAX. Many common instructions translate to a single "RISCy" instruction for the internal microarchitecture (something AMD noted, IIRC, in the original K5 with its AMD29050-derived core as "most instructions translate to 1 internal microinstruction, some between 2 to 4"). x86 prefixes are also way simpler than the complicated logic of decoding m68k or VAX. An instruction with multiple prefixes will quite probably decode to a single microinstruction.

      That said, there's a funny thing in that Transmeta tech survived quite a long while, to the point that there were Android tablets, in fact flagship Google ones like the Nexus 9, whose CPU was based on it - because Nvidia's "Denver" architecture used the same technology (AFAIK licensed from Transmeta, but don't cite me on this).

      • mananaysiempre 17 hours ago

        > Many common [x86] instructions translate to single "RISCy" instruction for the internal microarchitecture

        And then there are read-modify-write instructions, which on modern CPUs need two address-generation μops in addition to the load one, the store one, and the ALU one. So the underlying load-store architecture is very visible.

        There’s also the part where we’ve trained ourselves out of using the more CISCy parts of x86 like ENTER, BOUND, or even LOOP, because they’ve been slow for ages, and thus they stay slow.

        • p_l 17 hours ago

          Even many of the more complex instructions can often translate into surprisingly short sequences - all sorts of loop structures now have various kinds of optimizations, including instruction fusion, that probably would not be necessary if we hadn't stopped using higher-level LOOP constructs ;-)

          But, for example, REP MOVS is now handled as the equivalent of using SSE load-stores (16 bytes) or even AVX-512 load-stores (64 bytes).

          And of course the equivalent of LEA, via the ModRM/SIB encoding, is pretty much free, with it being (AFAIK) handled as a pipeline step.

        • monocasa 12 hours ago

          There's levels of microcode.

          It's not too uncommon for each pipeline stage or so to have their own uop formats as each stage computes what it was designed to and culls what later stages don't need.

          Because of this it's not that weird to see a single RMW uop at, say, the initial decode and microcode layer, that then gets cracked into different uops for the different functional units later on.

        • o11c 5 hours ago

          Even RMW instructions are very RISC compared to what people used to mean by CISC.

      • taolson 15 hours ago

        >something AMD noted IIRC in the original K5 with its AMD29050-derived core

        Just a small nitpick: I've seen the K5/29050 connection mentioned in a number of places, but the K5 was actually based upon an un-released superscalar 29K project called "Jaguar", not the 29050, which was a single-issue, in-order design.

    • JoshTriplett 14 hours ago

      Modern CPUs still translate individual instructions to corresponding micro-ops, and do a bit of optimization with adjacent micro-ops. Transmeta converted whole regions of code at a time, and I think it tried to do higher-level optimizations.

  • actionfromafar 14 hours ago

    Did anyone try dynamic recompilation from x86 to x86? Like a JIT taking advantage of the fact that the target ISA is compatible with the source ISA.

    • solarexplorer 14 hours ago

      Yes, I think the conclusion was that it did improve performance on binaries that were not compiled with optimizations, but didn't generate enough gains on optimized binaries to offset the cost of re-compilation.

      https://dl.acm.org/doi/10.1145/358438.349303

      (this is not about x86 but PA-RISC, but the conclusions would likely be very similar...)

    • hinkley 10 hours ago

      I believe it was HP who accidentally tried this while making an early equivalent of Rosetta to deal with a hardware change on their mainframes and minicomputers. They modified it to run same-to-same translations, and they did get notable performance improvements by doing so.

      I’m pretty sure this experiment happened before Transmeta existed, or when it was still forming. So it ended up being evidence that what they were doing might work. It also was evidence that Java wasn’t completely insane to exist.

    • tgma 13 hours ago

      Notably, VMware and the like, in the pre-hardware-virtualization era, did something like that to run x86 programs fast under virtualization instead of interpreting x86 through emulation.

  • bsder 7 hours ago

    > It was wrong, but it was controversial among experts at the time.

    Only for those who fail to study history--which means quite a few people.

    For example, when Intel announced Itanium, the DEC architects (and IBM's, I heard later) cheered. They knew Intel was about to dump a ton of money chasing a white elephant.

    Alas, they underestimated the business side and the fact that Itanium, while being a technical disaster, was basically a business success and scared everybody out of the microprocessor business except for IBM.

scq 19 hours ago

One aspect of Transmeta not mentioned by this article is their "Code Morphing" technique used by the Crusoe and Efficeon processors. This was a low level piece of software similar to a JIT compiler that translated x86 instructions to the processor's native VLIW instruction set.

Similar technology was developed later by Nvidia, which had licensed Transmeta's IP, for the Denver CPU cores used in the HTC Nexus 9 and the Carmel CPU cores in the Magic Leap One. Denver was originally intended to target both ARM and x86 but they had to abandon the x86 support due to patent issues.

https://en.wikipedia.org/wiki/Project_Denver

  • lproven 17 hours ago

    Code morphing was fascinating. I had no idea nVidia tried anything similar.

    I always felt Transmeta could have carved out a small but sustained niche by offering even less-efficient "morphing" for other architectures, especially discontinued ones. 680x0, SPARC, MIPS, Alpha, PA-RISC... anything the vendors stopped developing hardware (or competitive hardware) for.

  • SuperscalarMeme 16 hours ago

    So glad someone else also knew about this connection :) Details about Denver are pretty minimal, but this talk at Stanford is one of the most detailed I’ve been able to find for those interested. It’s fascinating stuff with lots of similarities to how Transmeta operated: https://youtu.be/oEuXA0_9feM?si=WXuBDzCXMM4_5YhA

    • Symmetry 15 hours ago

      There was a Hot Chips presentation by them that also gave some good details. Unlike the original Transmeta design they first ran code natively and only recompiled the hot spots.

PaulHoule 14 hours ago

God, the worst manager I ever had was the last one to stay at Transmeta to turn off the lights. Between working there and working at DEC he could boast that he'd supervised both Dave Cutler and Linus Torvalds.

One time I had to unravel a race condition and he seemed pissed that it took a few days and when I tried to explain the complexity he told me his name was on a patent for a system that would let several VAXes share a single disk and didn't need a lecture.

  • jagged-chisel 13 hours ago

    “Ah, let’s just put it on a VAX then…”

  • DetroitThrow 14 hours ago

    >he told me his name was on a patent for a system that would let several VAXes share a single disk

    Ha, "We stand on the shoulders of giants"...

noelwelsh 19 hours ago

Didn't Transmeta's technology end up in Apple's PowerPC emulator Rosetta, following the switch to Intel?

IIRC Transmeta's technology came out of HP (?) research into dynamic inlining of compiled code, giving performance comparable to profile-guided optimization without the upfront work. It worked similarly to an inlining JIT compiler, except it was working with already compiled code. Very interesting approach and one I think could be generally useful. Imagine if, say, your machine's bootup process was optimized for the hardware you actually have. I'm going off decades old memories here, so the details might be incorrect.
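
As I remember the Dynamo-style scheme, the runtime watches already-compiled code, counts arrivals at backward-branch targets, and once a target gets hot it records the path executed from there, optimizes it (including inlining across call boundaries), and caches it as a native fragment. Here is a self-contained toy in C of just that hot-spot counting (the threshold, names, and numbers are illustrative, not taken from the paper):

    #include <stdint.h>
    #include <stdio.h>

    #define HOT_THRESHOLD 50       /* "hot" after this many arrivals; the exact value is a tuning knob */
    #define SLOTS 1024

    static uint16_t counter[SLOTS];
    static uint8_t  has_fragment[SLOTS];   /* set once an optimized trace is cached for this target */
    static int      traces_compiled;

    /* Called by the interpreter each time control reaches a backward-branch target. */
    static void reached_loop_head(uint64_t pc) {
        unsigned slot = (unsigned)(pc % SLOTS);
        if (has_fragment[slot])
            return;                        /* a real system would dispatch into the cached fragment */
        if (++counter[slot] >= HOT_THRESHOLD) {
            has_fragment[slot] = 1;        /* a real system would record + optimize a trace from pc */
            traces_compiled++;
        }
    }

    int main(void) {
        for (int i = 0; i < 10000; i++) {
            reached_loop_head(0x8000);     /* hot loop head: reached every iteration */
            if (i % 500 == 0)
                reached_loop_head(0x8040); /* rarely reached target: never crosses the threshold */
        }
        printf("traces compiled: %d\n", traces_compiled);   /* expect 1 */
        return 0;
    }

The interesting work happens after the threshold trips: the recorded trace can cross procedure and library boundaries, which is where the inlining-like benefit comes from without recompiling anything.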

  • hayley-patton 18 hours ago
    • drob518 18 hours ago

      In the early 1990s, HP had a product called “SoftPC” that was used to emulate x86 on PA-RISC. IIRC, however, this was an OEM product written externally. My recollection of how it worked was similar to what is described in the Dynamo paper. I’m wondering if HP bought the technology and whether Dynamo was a later iteration of it? Essentially, it was a tracing JIT. Regardless, all these ideas ended up morphing into Rosetta (versions 1 and 2), though as I understand it, Rosetta also uses a couple hardware hooks to speed up some cases that would be slow if just performed in software.

      • raw_anon_1111 15 hours ago

        That wasn’t an HP product. It was written by Insignia Solutions and ran on multiple platforms.

        I had it on my Mac LCII in 1992. It barely ran well enough to run older DOS IDEs for college. Later I bought an accelerator (40Mhz 68030) and it ran better.

        https://en.wikipedia.org/wiki/SoftPC

        • hn_acc1 8 hours ago

          IIRC, I had that on my Atari ST as well, and it very slowly booted DOS 3.3 and a few basic programs... enough for me to use Turbo C or Watcom C to compile a basic .c program to display a .pcx file.

  • nostrademons 18 hours ago

    A lot ended up in HotSpot for the JVM. I know a number of extremely good engineers whose career path went TransMeta -> Sun -> Google.

  • iszomer 18 hours ago

    I remember it being in one of Sony VAIO's product lines, the PictureBook, known for its small form factor and a swivel webcam.

    • em-bee 17 hours ago

      That was the first laptop I owned ;-) As a frequent traveler I found it a very useful device.

      • iszomer 3 hours ago

        It was my first sub-notebook, coming from a ThinkPad 600E, and though my memory is fuzzy from 20+ years ago, I think I used an Atheros WiFi PCMCIA/CardBus(?) card with it at some point. It was also awkward sometimes having to carry the external 3.5" floppy drive too.

chihuahua 14 hours ago

I interviewed there around 1997 or 98. As part of the interview process, I had the opportunity to have lunch with Linus Torvalds. (I did not get an offer)

Tepix 17 hours ago

I had a pretty slick Toshiba Libretto L1 from Japan at the time - twice as wide as long, with a 1280x600 display.

Its 600MHz Transmeta Crusoe CPU was pretty slow, unfortunately. Comparable to a 333MHz Celeron, IIRC.

  • organsnyder 16 hours ago

    I used a Fujitsu Lifebook P-2046 laptop at university. It had an 800MHz Crusoe chip. IIRC it shipped with 256 MB of RAM, which I eventually upgraded to 384.

    Somehow I managed to tolerate running Gentoo on it. Compiling X, OpenOffice, or Firefox was a multi-day affair. One thing that annoyed me was I could never get the graphics card (an ATI Rage 128 with 4 MB RAM, IIRC) working with acceleration under Linux, and that was when compositing window managers were gaining prevalence; I kept trying to get it working in the hope that it would take a bit of the load off of the struggling CPU.

    Despite the bad performance, it worked really well for a college student: it was great for taking notes, and the batteries (extended main and optical drive bay) would easily last a full day of classes. It wouldn't run Eclipse very well, but most of my CS assignments were done using a text editor, anyways.

bohrbohra 19 hours ago

All I know about Transmeta is that Linus Torvalds moved over from Finland to the USA to work at this startup.

Other than that, it seems to have sunk without a trace.

  • MikeNotThePope 15 hours ago

    I worked at Transmeta. I remember for the launch of one of the Crusoe-powered laptops, there was a bug that prevented the BIOS from booting Linux. Since the laptop was only going to run Windows ME, they didn’t fix it. Of course when Linus got a demo unit to play with, the first thing he did was try to install Linux on it. He let everyone know, and the bug was fixed soon thereafter.

    • bitwize 14 hours ago

      Back in the day, Linux was less tolerant of incorrect behavior than Windows 9x was, and would crash, terminate a process, or otherwise surface errors at times when Windows 9x would just keep going until the bugs corrupted memory or similar. Having Linus aboard as a technical advisor, someone to whom you could say "hey, the CPU is crashing here, what's the kernel trying to do at that spot?", alone would probably have been well worth the money to hire him.

      • monocasa 12 hours ago

        He was also one of the few people in the world at the time who understood x86 privileged space like the back of their hand and hadn't signed any AMD or Intel NDAs. Linux was originally designed not to be portable, but as a platform for playing with 386 privileged-mode constructs. Portability came later (with Alpha, IIRC).

    • Barbing 14 hours ago

      Nice.

      Glad you were a part of it at the time?

      • MikeNotThePope 13 hours ago

        My job was essentially playing video games for two years to stress-test the chips. I was pretty good at Diablo 2 by the end of my run :) It was one of my better jobs!

  • NelsonMinar 14 hours ago

    Not before becoming the worst sort of patent trolls. "in January 2009, Transmeta sold itself to Novafora, who in turn sold the patent portfolio to Intellectual Ventures". (This was long after Linus had left.)

rsynnott 19 hours ago

> But they were still a technology company, and if their plans had gone well, they would have sold their product to dotcoms

I'm not sure that that's really correct; they were very desktop-oriented.

  • drob518 18 hours ago

    Well, they ended up being mobile-oriented, but even that didn’t work. They were definitely not server-oriented and they really couldn’t compete at desktop. Honestly, while the tech was interesting, it wasn’t really solving a problem that anyone was struggling with.

    • organsnyder 16 hours ago

      > it wasn’t really solving a problem that anyone was struggling with

      They did push the envelope on efficiency. My Crusoe-equipped laptop could go six hours on the stock battery (12+ on the extended batteries) back when most laptops struggled to get three.

    • dlcarrier 14 hours ago

      They probably would have worked well as server processors, because they were pretty energy efficient; they were slow the first time a program was run, but sped up after caching the translation. Most servers run the same software over and over again, so they could have been competitive.

      It would have been an extremely difficult time to enter the market though, because at the time Intel was successfully paying server manufacturers to not offer superior competing products.

CrankyBear 12 hours ago

To me, what's important about Transmeta is that they brought over some kid developer named Linus Torvalds to the States from Finland. He had invented some hobby operating system. I wonder whatever happened to him. :-)

deater 13 hours ago

at the time, just out of undergrad, I ended up working for the remnants of the #9 Video Card company that had been bought by S3 and was making a last effort at a Linux-based, Transmeta-powered "web-pad" (tablet): the "Frontpath ProGear" (new management wouldn't let them give it a Beatles-related name like #9 equipment used to get)

in any case, due to the unfortunate timing of the dot-com implosion, it never really went anywhere (I wish I had managed to keep one; they used to appear on eBay occasionally)

the one thing I remember is that it was memory-limited: it had 64MB, but I think the code-morphing software really wanted 16MB of it, which cut into the available system memory

PaulDavisThe1st 14 hours ago

More interestingly, whatever happened to David "Pardo" Keppel, whose PhD thesis and person were somewhat central to Transmeta (at least as far as I remember it)? For someone who was doing a CS PhD in the mid-90s, he has a vanishingly tiny online footprint. Not sure if that is inspiring or concerning...

  • BennyGezerit 5 hours ago

    He's at Intel, I believe. I found his name on some Intel patents from 2021 or so (I worked with him at Transmeta)

  • fullstop 14 hours ago

    Hopefully he is living his best life and doing what he enjoys.

urlgrey 5 hours ago

in 1997 I saw Linus Torvalds speak at UC Berkeley following his move to California to work at Transmeta. I was a computer science undergrad at UC Davis, and took Amtrak to Berkeley along with some friends to see Linus in person. Linux was building momentum, and Linus was a real celebrity to those in the space.

Supporting Linus and the Linux community is a great legacy for Transmeta, even if their products didn't find commercial success.

vjvjvjvjghv 15 hours ago

Just looked up their investments. These were the quaint days when an $88 million investment was a lot of money.

hinkley 10 hours ago

I wonder if Transmeta could have ever gotten us to a spot where new instructions could be added to old hardware via firmware updates, allowing us to simplify how code gets compiled for backward compatibility.

jpmattia 15 hours ago

> so IBM handled manufacturing of its first-generation CPUs.

I'm curious: Is there a consensus on which startup companies achieved success using IBM as a fab? or if not a consensus, I'd settle for anecdotes too.

My own company (which built 40G optical transponders) used them back in that era. While the tech was first rate, the pricing was something to behold.

  • jecel 10 hours ago

    My own memory of the events (which might be very wrong) was that a new vice-president of IBM semiconductors decided to drop bulk CMOS and focus exclusively on SOI (Silicon On Insulator). That suddenly left Transmeta without chips to sell. They had to scramble to find a new supplier and design their next generation processor for it (since the Crusoe wasn't portable to any other fabs). They were able to launch their Efficeon on TSMC 130nm (with a later version on Fujitsu 90nm) but the gap in supply was far worse for a startup than it would have been for a big company.

    • wmf 8 hours ago

      That doesn't make any sense. IBM is the last company that would shut down a fab with no warning, breaking a bunch of contracts.

  • wmf 12 hours ago

    Cisco and Cray used IBM fabs for multiple generations in the aughts but they weren't startups. Before the rise of TSMC it was a weird situation where fabless companies were kind of picking up extra capacity from IDMs.

  • dlcarrier 14 hours ago

    I don't know about startups, but the Cell processor in the PS3 and the Xenon processor in the Xbox 360 were both fabricated by IBM.

    • pdw 14 hours ago

      The Nintendo GameCube and Wii also had IBM CPUs.

  • rwmj 15 hours ago

    > the pricing was something to behold

    I guess you mean that not in a good way?

    • noir_lord 14 hours ago

      I'd imagine so. IBM are many things (some of them brilliant), but I don't think anyone's ever accused them of being cheap.

FiddlerClamp 14 hours ago

I remember my Compaq TC1000 well, a pen tablet convertible running Windows very sluggishly with a Transmeta Crusoe processor. Nice promise, execution not so much unfortunately.

  • BennyGezerit 5 hours ago

    It was a project with Chuck Thacker while he was at MSR.

axiolite 10 hours ago

Transmeta floundered because Intel simply appropriated their technologies, allowing Intel to jump ahead in energy efficiency:

https://www.computerworld.com/article/1565866/intel-settles-...

  • axiolite 5 hours ago

      Intel infringed on one of its patents when it inserted a technology called "enhanced SpeedStep" into its models
    
      starting in 1991 (which predates the Pentium Pro) [...] Transmeta was the first company to emphasize that power consumption was going to be a major headache for chip and computer makers.
    
      Transmeta's ideas did spark Intel to look more closely at power consumption.
    
    https://www.cnet.com/tech/tech-industry/transmeta-sues-intel...

Theodores 18 hours ago

I liked the Transmeta web page from before they launched. It was just bare HTML with no styling. It said:

  This page is not here yet.

The product hype and lack of knowledge about what it was meant that nobody knew what to expect. With those hyped expectations, and with Torvalds on board, everyone expected that everything would be different. But it wasn't.

A similar product launch was the Segway, where we went from this incredible vision of everyone on Segways to nobody wanting one.

The hype was part of the problem with Transmeta. Even in its delivered form it could have found a niche. For example, the network computer was in vogue at the time, thanks to Oracle. A different type of device, like a Chromebook, might have worked.

With Torvalds connected to Transmeta and the stealthy development, we never did get to hear about who was really behind Transmeta and why.

  • tyingq 18 hours ago
    • Theodores 18 hours ago

      Thanks for that, I was almost right - This web page is not here yet.

      I still use this as important placeholder text, not that anyone outside HN would get the reference.

  • lproven 17 hours ago

    > I liked the Transmeta web page from before they launched. It was just bare HTML with no styling. It said:
    >
    > This page is not here yet.

    I remember that fondly.

    If you did view source there was a comment that said something like:

    No, there are no hidden messages in the source code, either.

  • rvbissell 12 hours ago

    It also said in an html comment,

      there are no secret messages in this html
      there are no tyops in this html
    
    which at the time I took as some inside joke.

  • bobanrocky 7 hours ago

    Network computer - Sun, not Oracle.

    Oracle is not a company anyone would associate with engineering innovations!

  • aleph_minus_one 15 hours ago

    > A similar product launch was the Segway, where we went from this incredible vision of everyone on Segways to nobody wanting one.

    The problem with Segway in Germany was rather the certification for road traffic. Because of the insane red tape involved, the introduction was delayed, and for the same reason nobody wanted one.

    • mring33621 14 hours ago

      The children of Segway are still here!

      Electric unicycles and Onewheels!

      And they're really fun!

      • jsight 14 hours ago

        Yeah, for me Segway has been a great lesson in how patents can hold back innovation. It was a niche player that prevented others from trying to innovate on the form factor for a number of years.

        • mring33621 13 hours ago

          True!

          In a similar vein:

          One of the dads at school runs a company that does a nanotech waterproof coating for electronics (backed by patents). I told him that it would be very useful for personal electric vehicles, like electric unicycles. He replied that they looked at that, but decided not to license the tech for that use, because there wasn't enough money in it.

          Sad.

    • cycomanic 10 hours ago

      I seriously doubt that was the problem. The issue was always that these things were essentially a walking aid for the price of a motorbike/small car, and were particularly useless in Europe, where you had to traverse cobblestone roads or take one onto the metro (good luck with that).

      They were a complete hype product; their projections that they would essentially replace walking and pushbikes were just crazy. I don't think I know anyone who wanted one as more than a toy.

      As a side note, at an EE department where I was teaching around 10 years ago, one student built one as his final-year project. Pretty awesome; he essentially did everything himself, from software to all the mechanics/electronics... Worked very well, too.

hinkley 11 hours ago

The last non-Apple laptop I had was a Fujitsu Lifebook with a Transmeta processor in it. I did way too much work trying to get Linux to use every bit of the hardware. Mostly researching what others had done, but I also contributed to the ACPI code - half the buttons didn’t work on Linux because the factory default was broken. Windows had its own, but rather than pilfer that, someone pointed out that later Lifebooks had fewer issues, so I backported fixes from those, and I think invented one of my own by trying things that seemed reasonable.

I also looked at the TM-specific flags that they documented, and was surprised to find some that hadn’t been enabled on Linux despite Linus still working there at the time. They looked to be useful for low-power mode, and at that time I was looking for a carry-everywhere laptop with decent run time, so I invested some effort in those flags.

Turns out they didn’t do anything observable to the system. Power draw was unfazed by flipping those toggles. I don’t believe those changes ever got merged.

But it was the Linux fuckery that convinced me I wanted a bash shell and a Unix CLI and to just get shit done without having to fiddle all the time. I had better things to do. So I’ve been on Apple since, except for Pi, Docker, and work.

lief79 14 hours ago

I recall a coworker being excited several years ago about catching someone lying about their Linux experience before their interview. If what they said was true, they'd have to have been working on it during its first year.

He was then excited after the interview because the individual had been working at Transmeta with Linus, and his resume was accurate. He didn't end up working with us, but I wasn't privy to any additional information.

dlcarrier 15 hours ago

TL;DR:

    What happened to Transmeta was that in 2005, Transmeta shifted to licensing intellectual property rather than selling CPUs.
…they became a patent troll

  • sophacles 11 hours ago

    Nonsense. They are licensing IP they created.

    That's qualitatively and quantitatively different from buying patent rights for cheap (since even the original patent holders didn't think they were worth much) and suing random people who happen to use a product that may infringe on the patent.