I browsed through it a bit and these are some details that raised questions or which I found interesting:
There are multiple mentions of slop, for example: SlopsAuthorScoreFeature in HomeTweetTypePredicates. That means everyone gets a slop score between 0 and 1, which makes me wish that it was openly visible and that people with a high slop score would get a little piggy emoji next to their name.
There's a CLIENT_TWEET_TAKE_SCREENSHOT action, which is likely used to keep track of when a (mobile, presumably) client takes a screenshot. I hadn't considered this before, but for a social media app where posts are often shared externally through screenshots, keeping track of this can give you another engagement metric.
They have two types of NSFW filters: isNsfw and isSoftNsfw, but I couldn't figure out the distinction. Other metadata types include: isGore, isViolent, isSpam, isLowQuality, isOcr.
In ContentFeatureAdapter there's a getTweetLengthType function which shows the range for each tweet type. This is used to set TWEET_LENGTH_TYPE elsewhere. I wonder if it would help your virality to switch up your tweet lengths to regularly put out tweets which hit every length category, or if it doesn't significantly affect your potential reach.
There's a hardcoded list of top-level Grok topics [0]. Just mildly interesting to see what they consider to be top-level categories. Anime has achieved a significant cultural victory by getting separated into its own major category.
The timeout values for different service request types varied a lot across the application, which makes me curious about how they settled on those numbers. This is a question I've pondered in the past but haven't gotten around to researching deeply.
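For anyone wondering what a function like getTweetLengthType roughly does, here is a minimal sketch of length bucketing. The boundaries and label names below are made up for illustration; they are not the values from ContentFeatureAdapter, and the real code is not Python.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries (in characters) and labels.
# These are assumptions for illustration, not the repo's actual ranges.
BOUNDARIES = [30, 100, 200]
LABELS = ["very_short", "short", "medium", "long"]


def tweet_length_type(text: str) -> str:
    """Map a post's character count to a coarse length bucket."""
    # bisect_right counts how many boundaries are <= len(text),
    # which is exactly the index of the bucket the length falls into.
    return LABELS[bisect_right(BOUNDARIES, len(text))]
```

With these assumed boundaries, a post under 30 characters lands in the first bucket and anything of 200 characters or more lands in the last one. Whether hitting every bucket helps reach is not something the code alone can answer without the weights.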
Not sure if this is the right place to ask, but why does Bluesky feel so much faster to load and interact with compared to X? On the surface, both have similar interfaces and equally rich content, yet Bluesky consistently feels snappier and more responsive, even though it’s the newer platform.
Newer is generally faster; it hasn't had time to accumulate kludges and dead ends from years of evolution. The bigger factor, though, I would imagine, is not having 100 tons of analytics tracking everything.
IIRC, Twitter uses some mongrel version of React Native on the web. That's why you get the 3-second loading thingie whenever you open a new tab.
This is laudable. But the great thing about Twitter is that you don't have to use the algorithmic "For You" feed at all. You can just use the "Following" feed, which is purely chronological, and doesn't contain any recommended content. This isn't possible on Facebook, which makes it unusable for me.
I don’t use it - does it remember the setting? My recollection is that Facebook would make you switch to that chronological feed manually each time you load the page.
The "following" feed helps, but replies to almost any topic still attract outright white supremacists on X. They have seemingly endless time to fill the site up with their talking points.
It's so disappointing to see that Twitter has only released the source code of their algorithm while all of their competitors have released both algorithms and weights.
Elon’s father was not keen on him leaving South Africa from what I recall. There was an element of daddy doesn’t believe in me. He left against his father’s wishes.
’“He was such a terrible human being,” Elon, 46, told the magazine. “You have no idea.”’
Daddy issues out the ass. This is actually the simplest answer, because people that pursue external validation (in Elon’s case it is very extreme, nothing is ever enough) are a genuine product of child abuse (it’s obvious emotional abuse is a major factor here).
I’m not absolving Elon, just trying to understand the initial state.
Edit:
When Elon finally came home from the hospital, his father berated him. "I had to stand for an hour as he yelled at me and called me an idiot and told me that I was just worthless," Elon recalls. Kimbal, who had to watch the tirade, says it was the worst memory of his life. "My father just lost it, went ballistic, as he often did. He had zero compassion."
Remember, this is a guy that showed up with a toilet to the Twitter offices and fired everyone, got on everyone’s case to sleep in the offices. He is a literal product of child abuse, and while he may not be continuing the cycle with his kids (which is hard to say because he disowned his Trans kid), he is absolutely unleashing the cycle of abuse onto others wherever he goes.
He didn't just disown the Trans kid, he's disowned most of his kids. He literally has the moms sign a contract that says the kid will have no legal claim as his child.
No it’s not. People who don’t take child abuse seriously say stuff like that. Michael Jackson is another case of a man who regularly brought up his father’s abuse. These are lifelong scars that show up in everything you do in your life if you don’t face it. Both Elon and MJ engage in drug abuse very late into life. If you stop looking at them as extraordinary humans, and just look at them as ordinary humans, you’ll see that they track the path of many with dark upbringings (they become dysfunctional in some form or another if they don’t face the reality of their journey). They have a veeeery fucked up concept of what human worth is because their caretakers disrespected them.
Well, some version of them are open source, which may or may not be what is actually running in prod. AFAIK the patch which made it obsessively bring up white South Africans was never published, and this algorithm repo went over two years between updates so it obviously wasn't tracking prod.
Nope, nor did that repo have the system prompts that brought us MechaHitler, nor the time earlier this year where it started injecting Trump into every completion.
The Grok repo is a smokescreen for deniability (just not particularly plausible).
I would be surprised if that line in the prompt caused it without the other thing they did just before MechaHitler: Elon created a Twitter thread asking users to submit divisive politically incorrect facts for Grok training. It was full of Holocaust denial and white supremacy stuff.
I think Elon said he would release the weights. In a video somewhere. That's what he meant - when the next major version lands, they release the previous one?
People's choices can change, maybe the economic/geopolitical reality of AI race has been impressed upon him, but I think that's what he said.
This post is about the Twitter algorithm. He originally said it would be open source, but he just did a source dump 2 years ago. Now they did a new source dump with updated but heavily redacted code for the For You feed.
As for his claims about opening up Grok: Elon said that they would publish the n-1 weights for Grok. However, he dragged his feet and only recently released the weights for Grok 2. So now we're up to Grok 4 but he has yet to release the weights for Grok 3 despite his claims.
I think the problem with Elon is that he doesn't fully hold himself accountable for his words. If he decided that it was no longer economically viable to share Grok's weights then he should post an update about that. You cannot expect to win the goodwill of claiming to support open source and then continuously drag your feet while refusing to communicate your intentions clearly.
Previous discussions:
25-apr-2022 https://news.ycombinator.com/item?id=31160546 380 comments
31-mar-2023 https://news.ycombinator.com/item?id=35391433 1185 comments
Just like with their last release, they only released the architecture and not the weights. It may be useful for analyzing the system if you're a competitor (but from my last dive into it, it seemed like a strict subset of fancier, industry-leading rec systems), or perhaps getting into rec / retrieval systems as a newcomer.
However, this gives roughly zero insight into how Twitter's feed behaves.
Not only no weights. Not sure what people's expectations are but a lot of the time this isn't even valid code with all the redaction they did [1]. I'm confused as to who this is for, this surely isn't the repo they're working on, is it?
[1] https://github.com/twitter/the-algorithm/blob/main/trust_and...
This is 100% for headlines and Musk to be able to say "we're open" during interviews. Its actual usefulness is not the point
When they "open sourced" the Tesla Roadster the website only had a couple of mostly useless files. Discussion at the time https://news.ycombinator.com/item?id=38383099
Despite not containing more than a few random files, there were headlines everywhere about the "Open Source Tesla Roadster". There were countless comments, Tweets, and posts about how amazing it was that the Roadster was now open source.
None of the people reporting on it or praising it actually looked at the files and realized you couldn't actually build anything other than the HVAC control board for the car.
The reporters should be getting down to the point and asking Elon Musk about the practical usefulness of such a heavily redacted public release.
I can think of like 3 institutions that have reporters who would ask that kind of question (The Register, Ars Technica and 404media) and I don't think Musk is going to be sitting across the table from any of them, ever.
I believe these are the last kicks of a dying horse/bird...
Why do you take this so seriously? The world is moving on. Nobody will trust anyone with their freedom of speech, ever. Is this so hard to see?
Any centralized solution quickly implements censoring and starts banning users.
Are you talking about this?
It's rather weird that they would add keys to the source code like this, rather than reading from the environment or some secrets service. Rather than redacting the source, they should refactor to remove the keys from the source.
One example, that's right. Another one:
and right at the top:
There's no way you got to this bit without skipping over multiple actual redactions, like SQL queries with all of the details replaced with ellipsis. Why are you cherry-picking one innocent instance when you know exactly what the parent comment is talking about?
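On the keys-in-source point, a minimal sketch of what "reading from the environment" could look like. The variable name SERVICE_API_KEY and the helper are hypothetical; nothing here is taken from the repo.

```python
import os


def load_api_key(name: str = "SERVICE_API_KEY") -> str:
    """Fetch a secret from the process environment instead of the source tree.

    The default variable name is an assumption for illustration only.
    """
    key = os.environ.get(name)
    if not key:
        # Fail loudly at startup rather than falling back to a hardcoded value.
        raise RuntimeError(f"missing required secret: {name}")
    return key
```

With something like this, the published source would have nothing to redact for keys, which is a separate issue from the redacted SQL and logic the parent comment is talking about.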
what is your footnote referring to exactly?
I know when I think “open source”, I am always thinking “heavily redacted”.
/s
I'd assume that weights are changing constantly so they'd need to open source a service tweaking the weights in real time rather than the weights themselves...
They could publish a snapshot of any point in time. This is hosted on GitHub, literally the hub for actively-developed software and related assets.
Not an ML expert, but is it feasible to train the weights using the actual Twitter feed as an oracle?
No, even if you somehow were able to download the corpus of all public X posts. There are many hidden signals that are feature engineered in good recsys, and the stripped-down algo won't be able to replicate them.
It would cost a fortune in API calls, so it's not practical for anyone except internally at corporate.
The criteria for deciding which posts in comments or the feed are spam or should otherwise be suppressed are unsurprisingly also hidden. It's known that blue checkmark accounts rank above non-verified ones for comments, but I dunno about feed visibility.
For all its flaws .. it’s still a step up from how Parag and co used to run twitter
Unfortunately, this [0] cancels out everything ten-fold. The owner of the website is boosting the content of himself and the people he supports. This did not happen in the old twitter - not even close.
[0] https://github.com/twitter/the-algorithm/issues/236
Why? I've never been a twitter user
Post Musk Twitter is amazing. It lets you see how stories, opinions that you support or don’t are attacked from all sides and Community noted / @grok fact checked … a lot of UX changes too .. pre Musk, the moderation / banning was biased and arbitrary (who is watching the watchers?) .. my personal fav was to see the special tick removed from journalists ..
I think that's the first time I've seen someone positive about that change. My experience has been that by showing blue tick users above others, the platform has become a lot more biased, because it's only a certain type of person that pays for Twitter.
Yes, people like me who CAN’T STAND TO SEE THEIR TYPOS etc. up there on display forever.
Twitter/X user here. I agree with GP, it’s better than it was pre-Elon. The For You feed definitely seems less biased and more interesting, with fewer flame wars. I also think the exodus of Bluesky-ers has helped that (for which Elon gets partial credit). Yes, they do seem to have been backfilled by their right-side mirror images, but those people don’t seem to get amplified to the degree their predecessors were.
You are really going to try and say right wingers aren't amplified on twitter? I literally have an account that just follows gaming accounts and I was having to block people throwing out slurs daily
> You are really going to try and say right wingers aren't amplified on twitter?
FWIW, they see it - but interpret it as "Twitter being less biased" now, because from their POV, Twitter had a pro-liberal bias before Musk, and is now trending towards what they consider neutral.
Maybe you're just biased against normal people? According to the last election, most Americans support Trump. You'd expect a certain percent of the general population to be somewhat right-wing and maybe even rudely so - it's not the early 1990s, when only people with access to a university mainframe could get online. Since mobile browsers became popular (around 2010?) being able to type with two thumbs is the only real requirement to comment online.
Hearing gamers swear profusely on a daily basis is the experience of anyone who plays a game with chat.
Isn’t that a good thing that you have already created a mental filter about people who pay for it as being of a “certain type” .. the problem with ticks being bestowed upon some journalists is that they become the brokers/influencers by the virtue of simply working for a newspaper .. that power is vaporised now .. tbh the real question is why didn’t twitter pre musk implement community notes .. I mean it’s not such a bleeding edge / hard to execute idea ..
>tbh the real question is why didn’t twitter pre musk implement community notes
They did. Community notes are just the rebranded "Birdwatch" program that predates Musk.
Community Notes was literally built by pre-Elon Twitter: Birdwatch was first announced in August 2020, and it was initially launched in January 2021. In November 2022, Elon rebranded it to Community Notes and made it widely available.
In the same way that someone speaking on behalf of the White House is held to a standard whenever they speak, the same applies to journalists that are representing a newspaper.
Making everyone 'equal' is a political heuristic that IMO presupposes that journalists can't be trusted and are as useful as a random person paying $20/mo.
I'm going to go ahead and say that the last 5 years did in fact show that journalists cannot be trusted. I will agree that random persons paying $20 obliterates what was already an embarrassingly low bar. Really, opening it up just expanded the pool of what was already just influencers/activists. And why did celebrities have the bluecheck? It's probably more useful as a verification mechanism.
>that power is vaporised now
and replaced with something worse.
Then make that two times.
I also think Twitter under Musk is much better, way much more functionality in it.
Grok is a lickspittle, don't ask it about facts. But the clever thing is that Grok can do meta queries for you ('give me the last 30 users from X who posted about Y using the word...').
I’m a big fan of grok fact checks.
it has potential but they need to improve the @grok UI. Twitter is just cluttered with "@grok is this true?" spam and I had to mute it.
“Fact checks,” sure. Grok is finely tuned to its master’s demands and politics: https://archive.ph/G0Y4i
The very first question that the article writer said they posed to Grok 3 and Grok 4, "What is currently the biggest threat to Western civilization and how would you mitigate it?", didn't return anything like the simplistic answers in that article. Apparently, the article was politically driven.
When I asked Grok 4, two pages worth of answers were returned, including a table with columns for Threat, Reasoning, and Severity. The article is just plain wrong and fails the very fact-checking that it purported to do.
I’m not sure what point you think you’re making. The article points to several examples of Grok giving a politically unfavorable answer to a user, Musk throwing a fit about that answer, and then Grok returning a politically tuned answer several days later. It’s observation, not some sort of gotcha by the author. Whatever you’re doing with Grok right now is irrelevant in this context.
I agree. I was banned pre-Musk; I think now it's more free, with fewer bans.
There might be some value if someone can show that the feed misbehaves for some selection of weights.
Nope. Every single system like that will misbehave if given a bad set of weights, or even a random set of weights. I'd go as far as saying that even with "good" weights, it's likely to have some sigma of misbehavior.
RIP author_is_elon, we hardly knew ye.
https://github.com/twitter/the-algorithm/issues/236
The file in question is now here: https://github.com/twitter/the-algorithm/blob/main/home-mixe...
author_is_elon, author_is_democrat, and author_is_republican are in fact gone. Now there is grok_politics_neutral, grok_politics_left, and grok_politics_right. This is in addition to a whole group of other categories, such as grok_category_sports and grok_category_music. All are based on annotations by Grok.
Importantly, this file is not used for recommendations. Everything in this file is only used for "metrics tracking purposes to measure how often we serve posts with various attributes." This would also have applied to author_is_elon.
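To make "metrics tracking purposes" concrete, here is a rough sketch of counting how often served posts carry a given attribute. The attribute names are the ones quoted above; the function and data shape are assumptions for illustration, not how home-mixer actually does it.

```python
from collections import Counter
from typing import Iterable, Set


def count_served_attributes(served_posts: Iterable[Set[str]]) -> Counter:
    """Tally how many served posts carry each attribute label.

    Each element of served_posts is the set of labels attached to one post.
    This only observes what was served; it does not influence ranking.
    """
    counts: Counter = Counter()
    for attributes in served_posts:
        counts.update(attributes)
    return counts


# Usage with made-up data:
served = [
    {"grok_politics_left", "grok_category_sports"},
    {"grok_category_sports"},
    {"grok_politics_neutral", "grok_category_music"},
]
totals = count_served_attributes(served)
```

The point of the sketch is that a tally like this sits downstream of ranking: it reads labels off posts that were already selected.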
Oh my god lolollol
author_is_elon
author_is_power_user
author_is_democrat
author_is_republican
Republican, Democrat, and Elon.
Wow.
South Park: The Game level of irony.
Rep, Dem, and "America Party".
Is this real? We accept that the algorithm may link you abstractly with other people, but I didn’t think they were literally labeling on this level. If you just say “we look for what’s similar and leave it at that”, then there’s much less liability.
This is political targeting. This guy was one of the biggest political donors, how can this fly?
Yes, he really had twitter change their code to push his tweets more.
They seem to have dialed the overt Elon boosting down now but it's still conspicuously aligned with his priorities. I just made a fresh burner account to see what the algorithm is primed to push by default nowadays, and about 80% of the feed is anti-immigration ragebait.
I think that's just an accurate and mostly genuine indicator of who is left on twitter nowadays.
The people left on twitter earnestly believe that it is better now that you can shout racial slurs at people, buy your way to the top of any chain, get literally paid for ragebait, and genuinely think this repo is meaningful.
It's a massive self selection bias.
Musk is actively pushing white nationalism these days, so maybe he adjusted the algorithm in line with his political priorities: https://bsky.app/profile/harikunzru.bsky.social/post/3lxrqzm...
> This guy was one of the biggest political donors, how can this fly?
The system is rigged. Haven't you noticed yet?
Looks pretty real:
https://github.com/twitter/the-algorithm/blob/7f90d0ca342b92...
When this started it really put me off X - I'd have tolerated, and almost liked the idea of, a freedom of speech place. But a place that boosts its owner's posts... Nope.
I'm out - it's such a big personal diss of me, I'm not interested any more.
You do realize people officially register as party members right? I have no idea why this upsets you. It's just categorization. I fucking hope my feeds do this, I do not want to see maga trash.
I've always wondered - how can I as a non X engineer be sure that the code on GH is actually deployed on their servers?
It’s not. The last “algorithm” release was a random grab bag of code which existed in some of the Twitter repo that might have been tangentially related to recommendations/feed.
Source: worked at Twitter in ML/recsys.
Anon, when I was looking through this source dump I saw a huge range of timeouts used in various services, do you know if there's any writeup or explanation as to how the engineering team settled on those values?
False, this is definitely production code.
Source: I work at Twitter.
This is not believable. It's not syntactically valid Python.
https://github.com/twitter/the-algorithm/blob/c54bec0d4e029f...
"..." all over the place in 2 year old code is production code?
And people who work at X don't say they work at Twitter.
~65k lines added, ~3k removed in a span of more than 2 years. Do you guys do anything there?
This does not contradict what GP said.
Even if this is the actual production code at this very second, it won't match prod for long if they continue this pattern of only dropping an update every two years or so.
Honest question. Would you even dare to say you work at Twitter and then spill the beans on some very public lie or misdirection? It’s trivial to match your writing style between your HN comments and your work emails to identify you. Musk is famously a very petty, bitter, and vindictive person with an easy to bruise ego.
I don’t have any knowledge of the reality inside Twitter but I also have no reason to believe the company would be transparent given the many past controversies, or that any one employee would be able to look at this code which has obvious redactions and say “everything else is definitely 100% prod” and not exactly what GP suggested.
> Source: I work at Twitter.
Please stop
edit: disregard, misinterpreted
because the person you replied to said they worked (past tense) at twitter, unlike you who says they [currently] work (present tense) at twitter
why would they tell someone not working at twitter anymore to stop working at twitter? and how does that amount to "biased, hypocritical, one-way persecution"
I don’t think that’s the point of open sourcing things, in general
I agree in general it isn't. But in this case Musk claimed that was the point of open-sourcing the algorithm. Transparency on what they are or are not suppressing.
When Tesla "open sourced" their patents, they required companies taking them up on it to, not reciprocally, not copy their "designs". So you get access to their patents in exchange for vague restrictions broader than the patent or copyright system.
Oh, I see. Well, purely on his claim:bs ratio, I too would take that with a grain of salt :)
you can't, and it's 100% sure it's not this code running in prod
100% huh? That's a bold statement with no supporting evidence.
Already posted above: https://github.com/twitter/the-algorithm/blob/c54bec0d4e029f...
It's redacted.
Claiming that there's no supporting evidence is a bold (and obviously false) claim when the code is 2 years old and heavily redacted.
Sounds like the right tone when discussing a Musk project.
How can you be sure that the machine code that was generated from your C source files actually match the behaviour encoded in them?
https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...
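On the narrower question of checking that an artifact matches a source tree, the usual approach is a reproducible build plus a digest comparison. Below is a minimal sketch of the comparison step only, assuming you can get your hands on both files (for a hosted service like this one you generally can't, and the Thompson paper is about why even a match doesn't settle it if you don't trust the toolchain). The function names and paths are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"".
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(local_build: Path, deployed: Path) -> bool:
    """True if both files are byte-for-byte identical (by digest).

    This only says the two artifacts match. It says nothing about whether
    the compiler or build environment that produced them is trustworthy.
    """
    return sha256_of(local_build) == sha256_of(deployed)
```

Usage would be something like `same_artifact(Path("my-build.jar"), Path("their-build.jar"))`, where both filenames are placeholders.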
This is essentially useless without the training set or the weights. It's open-source theatre.
I browsed through it a bit and these are some details that raised questions or which I found interesting:
There's multiple mentions of slop, for example: SlopsAuthorScoreFeature in HomeTweetTypePredicates. That means everyone gets a slop score between 0 and 1, which makes me wish that it was openly visible and that people with a high slop score would get a little piggy emoji next to their name.
There's a CLIENT_TWEET_TAKE_SCREENSHOT action, which is likely used to keep track of when a (mobile, presumably) client takes a screenshot. I hadn't considered this before, but for a social media app where posts are often shared externally through screenshots, keeping track of this can give you another engagement metric.
They have two types of NSFW filters: isNsfw and isSoftNsfw, but I couldn't figure out the distinction. Other metadata types include: isGore, isViolent, isSpam, isLowQuality, isOcr.
In ContentFeatureAdapter there's a getTweetLengthType function which shows the range for each tweet type. This is used to set TWEET_LENGTH_TYPE elsewhere. I wonder if it would help your virality to switch up your tweet lengths to regularly put out tweets which hit every length category, or if it doesn't significantly affect your potential reach.
There's a hardcoded list of top-level Grok topics [0]. Just mildly interesting to see what they consider to be top-level categories. Anime has achieved a significant cultural victory by getting separated into its own major category.
The timeout values for different service request types varied a lot across the application, which makes me curious about how they settled on those numbers. This is a question I've pondered in the past but haven't gotten around to researching deeply.
[0] https://github.com/twitter/the-algorithm/blob/c54bec0d4e029f...
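For anyone wondering what a length-type feature like the one from getTweetLengthType boils down to, here is a minimal sketch in Python. The boundaries and labels are invented for illustration; the real ranges are in ContentFeatureAdapter and are not reproduced here.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries in characters. These are NOT the values
# used in the repo; they only illustrate the bucketing idea.
BOUNDARIES = [30, 100, 200]
LABELS = ["very_short", "short", "medium", "long"]


def tweet_length_type(text: str) -> str:
    """Map a tweet's character count to a coarse length category."""
    # bisect_right returns how many boundaries are <= len(text),
    # which is the index of the bucket the length falls into.
    return LABELS[bisect_right(BOUNDARIES, len(text))]
```

A ranking model would typically consume a category like this as a categorical feature, so whether varying your lengths helps reach depends entirely on learned weights that aren't in the repo.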
I assume soft NSFW is non-hardcore content
Not sure if this is the right place to ask, but why does Bluesky feel so much faster to load and interact with compared to X? On the surface, both have similar interfaces and equally rich content, yet Bluesky consistently feels snappier and more responsive, even though it’s the newer platform.
Newer is generally faster, since it hasn't had time to accumulate kludges and dead ends from years of evolution. The bigger factor, though, I would imagine, is not having 100 tons of analytics tracking everything.
IIRC, Twitter uses some mongrel version of React Native on the web. That's why you get the 3-second-long loading thingie whenever you open a new tab.
No idea, but Twitter is functionally unusable if you're not logged in.
Could be from lower usage.
sidenote: when do you think they're going to coax GitHub to transfer the `x` username?
https://github.com/x/
Always good to see some Scala in the wild. :)
There are dozens of us!
I tried going through the latest diff, but there is so much boilerplate that I wasn't able to find any real insights through skimming.
Has anyone found anything useful? Interesting needle-in-a-haystack problem for LLMs to try as well.
I want a social media where I can ssh into the servers with limited privileges, enough to see what's going on but not cause harm.
I want a social media where all of that is true of its owners.
https://x.com/XEng/status/1965226798460887127
Interesting. So the numbers/fractions of "for you feed isn't working" complaints, and specifically that complaint, is above some threshold?
Just ping Nikita, he'll tell you the current algo.
hope it doesn't involve goat again
This is laudable. But the great thing about Twitter is that you don't have to use the algorithmic "For You" feed at all. You can just use the "Following" feed, which is purely chronological, and doesn't contain any recommended content. This isn't possible on Facebook, which makes it unusable for me.
I don’t use it - does it remember the setting? My recollection is that Facebook would make you switch to that chronological feed manually each time you load the page.
As I said, Facebook doesn't have a chronological feed. Twitter remembers the feed setting.
The "following" feed helps, but replies to almost any topic still attract outright white supremacists on X. They have seemingly endless time to fill the site up with their talking points.
Facebook has a "friends only" feed (Friends button, next to Home), but it's not chronological.
That ignores the way Twitter sorts replies, which always use an algorithmic weight-based listing.
Not if you sort by recency or likes.
On iOS at least, there is a “Sort replies > latest” option which is strictly chronological
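To make the distinction in this subthread concrete, here is a minimal sketch of the three reply orderings being discussed: latest, most liked, and a weighted "ranked" order. Everything here is hypothetical, including the `Reply` fields, the `author_boost` multiplier, and the scoring formula; none of it is taken from the repo.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    text: str
    posted_at: int  # Unix timestamp in seconds
    likes: int
    author_boost: float = 1.0  # hypothetical per-author multiplier


def ranked_score(reply: Reply) -> float:
    # Made-up formula purely for illustration: likes scaled by an author boost.
    return (reply.likes + 1) * reply.author_boost


def sort_replies(replies, mode="ranked"):
    """Return replies ordered by the chosen mode, highest key first."""
    keys = {
        "latest": lambda r: r.posted_at,  # strictly chronological, newest first
        "likes": lambda r: r.likes,       # most liked first
        "ranked": ranked_score,           # weighted score, opaque to the user
    }
    return sorted(replies, key=keys[mode], reverse=True)
```

The point is that "latest" and "likes" are orderings a user can verify by eye, while a weighted order depends on inputs the user never sees.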
so basically pay to play
It's so disappointing to see that Twitter has only released the source code of their algorithm while all of their competitors have released both algorithms and weights.
Elon’s father was not keen on him leaving South Africa from what I recall. There was an element of daddy doesn’t believe in me. He left against his father’s wishes.
https://people.com/human-interest/elon-musk-errol-musk-relat...
’“He was such a terrible human being,” Elon, 46, told the magazine. “You have no idea.”’
Daddy issues out the ass. This is actually the simplest answer, because people that pursue external validation (in Elon’s case it is very extreme, nothing is ever enough) are a genuine product of child abuse (it’s obvious emotional abuse is a major factor here).
I’m not absolving Elon, just trying to understand the initial state.
Edit:
When Elon finally came home from the hospital, his father berated him. "I had to stand for an hour as he yelled at me and called me an idiot and told me that I was just worthless," Elon recalls. Kimbal, who had to watch the tirade, says it was the worst memory of his life. "My father just lost it, went ballistic, as he often did. He had zero compassion."
https://www.cbsnews.com/news/book-excerpt-elon-musk-by-walte...
Remember, this is a guy that showed up with a toilet to the Twitter offices and fired everyone, got on everyone’s case to sleep in the offices. He is a literal product of child abuse, and while he may not be continuing the cycle with his kids (which is hard to say because he disowned his Trans kid), he is absolutely unleashing the cycle of abuse onto others wherever he goes.
He didn't just disown the Trans kid, he's disowned most of his kids. He literally has the moms sign a contract that says the kid will have no legal claim as his child.
Sucks, dads, please show your children that you love them unconditionally.
This is the most shallow take on Elon I have read.
No it’s not. People who don’t take child abuse seriously say stuff like that. Michael Jackson is another case of a man who regularly brought up his father’s abuse. These are lifelong scars that show up in everything you do in your life if you don’t face it. Both Elon and MJ engage in drug abuse very late into life. If you stop looking at them as extraordinary humans, and just look at them as ordinary humans, you’ll see that they track the path of many with dark upbringings (they become dysfunctional in some form or another if they don’t face the reality of their journey). They have a veeeery fucked up concept of what human worth is because their caretakers disrespected them.
https://www.unicef.org/press-releases/nearly-400-million-you...
This is a common thing, not a hot take.
Does it include the bit about white South Africans?
That was Grok, not the X algorithm.
And the Grok prompts are also open source: https://github.com/xai-org/grok-prompts
Well, some version of them is open source, which may or may not be what is actually running in prod. AFAIK the patch which made it obsessively bring up white South Africans was never published, and this algorithm repo went over two years between updates so it obviously wasn't tracking prod.
> which may or may not be what is actually running in prod.
This is the case for every open source software ever.
> AFAIK the patch which made it obsessively bring up white South Africans was never published
Or there never was a specific patch for that purpose, contrary to what you are assuming.
> this algorithm repo went over two years between updates so it obviously wasn't tracking prod.
You are mixing up something. The Grok prompt repo is a different repo from the recommendation algorithm, and the former has been updated regularly.
Did they ever open source in there the stuff that was making Grok search for Musk's opinion before giving an opinion on world news?
Nope, nor did that repo have the system prompts that brought us MechaHitler, nor the time earlier this year where it started injecting Trump into every completion.
The Grok repo is a smokescreen for deniability (just not particularly plausible).
> Nope, nor did that repo have the system prompts that brought us MechaHitler
False, you made that up.
https://decrypt.co/329365/bye-bye-mechahitler-elon-musk-xai-...
> The Grok repo is a smokescreen for deniability (just not particularly plausible).
Completely unfounded conspiracy theory.
I would be surprised if that line in the prompt caused it without the other thing they did just before MechaHitler: Elon created a Twitter thread asking users to submit divisive, politically incorrect facts for Grok training. It was full of Holocaust denial and white supremacy stuff.
https://x.com/elonmusk/status/1936493967320953090
Most likely they rolled back finetuning on that thread at the same time they adjusted the prompt.
I think Elon said he would release the weights. In a video somewhere. That's what he meant - when the next major version lands, they release the previous one?
People's choices can change, maybe the economic/geopolitical reality of AI race has been impressed upon him, but I think that's what he said.
This post is about the Twitter algorithm. He originally said it would be open source, but he just did a source dump 2 years ago. Now they did a new source dump with updated but heavily redacted code for the For You feed.
As for his claims about opening up Grok: Elon said that they would publish the n-1 weights for Grok. However, he dragged his feet and only recently released the weights for Grok 2. So now we're up to Grok 4 but he has yet to release the weights for Grok 3 despite his claims.
I think the problem with Elon is that he doesn't fully hold himself accountable for his words. If he decided that it was no longer economically viable to share Grok's weights then he should post an update about that. You cannot expect to win the goodwill of claiming to support open source and then continuously drag your feet while refusing to communicate your intentions clearly.