Cursed Knowledge

504 points by bqmjjx0kac 4 days ago

I loved this the moment I saw it. After looking at an example commit[1], I love it even more. The cursed knowledge entry is committed alongside the fix needed to address it. My first instinct is that every project should have a similar facility. The log is not just cathartic, but turns each frustrating speedbump into a positive learning experience. By making it public, it becomes both a tool for both commiseration and prevention.

1 - https://github.com/savely-krasovsky/immich/commit/aeb5368602...

delusional 4 days ago

I agree, I usually put this sort of information in the commit message itself. That way it's right there if anybody ever comes across the line and wonders "why did he write this terrible code, can't you just ___".
- motorest 4 days ago
  
  As a side note, it's becoming increasingly important to write down this info in places where LLMs can access it with the right context. Unfortunately commit history is not one of those spots.
  - pushedx 4 days ago
    
    There's no reason that an LLM couldn't (or isn't) being trained on commit messages.
    No difference between a git index and any other binary data (like video).
    
    motorest 4 days ago
    
    > There's no reason that an LLM couldn't (or isn't) being trained on commit messages.
    You are arguing that it could. Hypotheticals.
    But getting back to reality, today no coding assistant supports building system prompts from commit history. This means it doesn't. This is a statement of fact, not an hypothetical.
    If you post context in commit messages, it is not used. If you dump a markdown file in the repo, it is used automaticaly.
    What part are you having a hard time understanding?
    
    daveguy 4 days ago
    
    You seem to be confusing the construction of system prompts with "training". Prompts do not change a model's weights or train them in any way. Yes they influence output, but only in the same way different questions to LLMs (user prompts) influence output. Just because it's not available in current user interfaces to use commit messages as a prompt does not mean the model wasn't trained with them. It would be a huge failure for training from version controlled source code to not include the commit messages as part of the context. As that is a natural human language description of what a particular set of changes encompasses (given quality commits, but quality is a different issue).
    
    motorest 4 days ago
    
    > You seem to be confusing the construction of system prompts with "training".
    I'm not. What part are you having a hard time following?
    
    daveguy 3 days ago
    
    > But getting back to reality, today no coding assistant supports building system prompts from commit history. This means it doesn't. This is a statement of fact, not an hypothetical.
    This is a non-sequiteur. Just because coding assistants don't support building system prompts from commit history doesn't mean LLMs and coding assistants aren't trained on commit messages as part of the massive number repositories they're trained on.
    What part are you having a hard time following?
    
    Jarwain 3 days ago
    
    > As a side note, it's becoming increasingly important to write down this info in places where LLMs can access it with the right context. Unfortunately commit history is not one of those spots.
    This is the comment that spawned this tragedy of miscommunication.
    My interpretation of this comment is that no current programming agents/llm tooling utilize commit history as part of their procedure for building context of a codebase for programming.
    It is not stating that it Cannot, nor is it making any assertion on whether these assistants can or cannot be Trained on commit history, nor any assertion about whether commit history is included in training datasets.
    All its saying is that these agents currently do not automatically _use_ commit history when finding/building context for accomplishing a task.
    
    mejutoco 4 days ago
    
    There are MCP Servers that give access to git repo information to any LLM supporting MCP Servers.
    For example:
    >The GitHub MCP Server connects AI tools directly to GitHub's platform. This gives AI agents, assistants, and chatbots the ability to read repositories and code files, manage issues and PRs, analyze code, and automate workflows. All through natural language interactions.
    source: https://github.com/github/github-mcp-server
    
    laggyluke 2 days ago
    
    This is hair-splitting, because it's technically not a part of _system prompt_, but Claude Code can and does run `git log` even without being explicitly instructed to do so, today.
  - freedomben 4 days ago
    
    Also there's a lot of humans that won't look at the commit history, and in many cases if the code has been moved around the commit history is deep and you have to traverse and read potentially quite a few commits. Nothing kills the motivation more than finally finding the original commit and it mentioning nothing of value. For some things it's worth the cost of looking, but it's ineffective often enough that many people won't bother
    
    simpaticoder 4 days ago
    
    The OP solved this problem by generating a well-known url, hosting it publicly, and including a link to the commit in the cursed knowledge inventory.
    
    gsmt 4 days ago
    
    I usually spot these kind of changes through git blame whenever I find a line suspicious and wonder why it was written like that
  - rf15 4 days ago
    
    You are sadly completely missing the point of ever-self-improving automation. Just also use the commit history. Better yet: don't be a bot slave that is controlled and limited by their tools.
    
    motorest 4 days ago
    
    > You are sadly completely missing the point of ever-self-improving automation. Just also use the commit history.
    I don't think you understand the issue you're commenting on.
    It's irrelevant whether you can inject commit history in a prompt.
    The whole point is that today's support for coding assistants does not support this source of data, whereas comments in source files and even README.md and markdown files in ./docs are supported out of the box.
    If you rely on commit history to provide context to your team members, once they start using LLMs this context is completely ignored and omitted from any output. This means you've been providing context that's useles and doesn't have any impact on future changes.
    If you actually want to help the project, you need to pay attention on whether your contributions are impactful. Dumping comments into what amounts to /dev/null has no impact whatsoever. Requiring your team to go way out of their way to include in each prompt extra context from a weird source that may or may not be relevant is a sure way to ensure no one uses it.
    
    rf15 4 days ago
    
    And my answer is: stop being a user when you want to be a developer so bad. Write the tool you need.
    (we certainly did with our company internal tool, but then we're all seniors who only use autocomplete and query mechanisms other than the impractical chat concept)
  - BobaFloutist 4 days ago
    
    That sounds like work someone should get paid to do.

treve 4 days ago

The '50 extra packages' one is wild. The author of those packages has racked up a fuckload of downloads. What a waste of total bandwidth and disk space everywhere. I wonder if it's for clout.

bikeshaving 4 days ago

The maintainer who this piece of “cursed knowledge” is referencing is a member of TC39, and has fought and died on many hills in many popular JavaScript projects, consistently providing some of the worst takes on JavaScript and software development imaginable. For this specific polyfill controversy, some people alleged a pecuniary motivation, I think maybe related to GitHub sponsors or Tidelift, but I never verified that claim, and given how little these sources pay I’m more inclined to believe he just really believes in backwards compatibility. I dare not speak his name, lest I incur the wrath of various influential JavaScript figures who are friends with him, and possibly keep him around like that guy who was trained wrong as a joke in Kung Pow: Enter the Fist. In 2025, I’ve moderated my opinion of him; he does do important maintenance work, and it’s nice to have someone who seems to be consistently wrong in the community, I guess.
- titanomachy 4 days ago
  
  This is Wimp Lo! We trained him wrong on purpose, as a joke.
  Long time since I thought of that movie.
- karel-3d 4 days ago
  
  to save everyone else a search, it's probably ljharb. (I am not a member of JS community, so, come and attack me.)
  - Sammi 4 days ago
    
    Saga starts here:
    https://x.com/BenjaminMcCann/status/1804295731626545547?lang...
    https://github.com/A11yance/axobject-query/pull/354
    Specifically Ben McCann along with other Svelte devs got tired of him polluting their dependency trees with massive amount of code and packages and called him out on it. He doubled down and it blew up and everyone started migrating away from his packages.
    ljharb also does a lot of work on js standards and is the guy you can thank for globalThis. Guy has terrible taste and insists everyone else should abide by it.
    
    karel-3d 4 days ago
    
    this specific saga starts 1 year before that, arguably more insane thread
    https://github.com/A11yance/aria-query/pull/497
    
    Too 3 days ago
    
    Wow. If this is not laying the foundation for a supply chain attack I don’t know what this is.
  - sunaookami 4 days ago
    
    Wow that's some deep rabbit hole. This guy gets paid per XY npm downloads and games the system through this. Awful.
    
    karel-3d 4 days ago
    
    There is apparently a tool, that you can upload your package.json and it will show you how much dependencies are controlled by ljharb
    https://voldephobia.rschristian.dev/
    
    rschristian 14 hours ago
    
    Ha, was wondering why I started getting a few more stars all of a sudden.
    For extra context: I created the tool ~9 months prior to the meltdown as one could vaguely mention an individual trolling over NPM deps and absolutely everyone in the ecosystem with a bit of experience would know who was being referred to, aka, "You Know Who". And, if you dared mention him by name, he'd eventually show up reciting his download counts in endless "appeal to authority"-style arguments, trying to brow-beat people into accepting that he knows more or whatever, ergo, "He Who Must Not Be Named" (at least, if you didn't want him being annoying in your mentions).
    There's a number of "-phobia" apps in the ecosystem and given the negative impact he has on dependency trees, it felt fitting to offer a similar, somewhat satirical, app to detect how much of your dependency tree he controlled.
    
    dvfjsdhgfv 4 days ago
    
    It looks like if I wanted to install a particular piece of software on many modern websites and I didn't have enough resources to hack node itself, talking to this guy would be a logical choice.
    
    karel-3d 4 days ago
    
    Eh, as much as I think this guy has very weird opinions; if he wanted to cause harm, he would do it many years ago. When I started looking him up, he DOES do a lot of good work in the ecosystem. Which makes this more complex issue.
    But, also, he does this "backwards compatibility forever" insanity. I think it's his crusade.
    
    goriv 4 days ago
    
    Damn, I just checked a random express project I built and there are a lot of things underlined in red there. I think the most amazing one is https://www.npmjs.com/package/is-number-object, which has a stupidly large dependency tree.
- jddj 4 days ago
  
  Looking forward to this Jia Tan sequel in a few years' time.
- Havoc 4 days ago
  
  Forgive my ignorance of js matters but how does adding packages improve backward compatibility at all?
  - motorest 4 days ago
    
    > Forgive my ignorance of js matters but how does adding packages improve backward compatibility at all?
    The scheme is based on providing polyfills for deprecated browsers or JavaScript runtimes.
    Here is the recipe.
    - check what feature is introduced by new releases of a browser/JavaScript runtime,
    - put together a polyfill that implements said feature,
    - search for projects that use the newly introduced feature,
    - post a PR to get the project to consume your polyfill package,
    - resort to bad faith arguments to pressure projects to accept your PR arguing nonsense such as "your project must support IE6/nodejs4".
    Some projects accept this poisoned pill, and whoever is behind these polyfill packages further uses their popularity in bad faith arguments ("everyone does it and it's a very popular package but you are a bad developer for not using my package")
    I had the displeasure of stumbling upon PRs where tis character tries to argue that LTS status does not matter at all I'm determining whether a version of node.js should be maintained, and the fact that said old version of node.js suffers from a known security issue is irrelevant because he asserts it's not a real security issue.
    
    Havoc 4 days ago
    
    Thanks for explaining!
Centigonal 4 days ago

It's probably a clout thing, or just a weird guy (Hanlon's Razor), but a particularly paranoid interpretation is that this person is setting up for a massive, multi-pronged software supplychain attack.
- godelski 4 days ago
  
  Those don't have to be mutually exclusive. Often those with clout are targeted for supplychain attacks. Take xz as an example. Doesn't seem unreasonable that a solo dev or small team looks to either sell their projects or transfer them to someone else (often not even with money exchanging hands). Or even how old social media accounts are hacked so that they can appear as legitimate accounts.
  I'm big on Hanlon's Razor too, but that doesn't mean the end result can't be the same.
- motorest 4 days ago
  
  > (...) but a particularly paranoid interpretation is that this person is setting up for a massive, multi-pronged software supplychain attack.
  That person might not be doing it knowingly or on purpose, but regardless of motivations that is definitely what is being done.
  - whilenot-dev 4 days ago
    
    A package "for-each"[0] that depends on a package "is-callable"[1], just to make forEach work on objects? Nope, not buying the goodwill here.
    [0]: https://www.npmjs.com/package/for-each
    [1]: https://www.npmjs.com/package/is-callable
    
    whilenot-dev 4 days ago
    
    To be fair, he himself removed his unnecessary dependency that caused the explosion of dependencies: https://github.com/A11yance/aria-query/commit/ee003d2af54b6b...
    EDIT: Oops, he just did the changelog entry. The actual fix was done by someone else: https://github.com/A11yance/aria-query/commit/f5b8f4c9001ba7...
    
    motorest 4 days ago
    
    Older browsers don't support foreach, so it's not like a polyfill is unheard of
    https://caniuse.com/?search=foreach
    
    whilenot-dev 4 days ago
    
    Are you serious here? It isn't a polyfill, it's supposed to work on plain objects which isn't part of the spec at all. Besides that, Array.prototype.forEach is only unsupported in Android Browser 4.3 (from July 2013) and IE8 (from May 2008). Seems like a weird reasoning to add it to packages in 2025.
    
    motorest 4 days ago
    
    > Are you serious here?
    I am.
    If you check the definition of polyfill, you'll eventually arrive at something like the following:
    > A polyfill is a piece of code (usually JavaScript on the Web) used to provide modern functionality on older browsers that do not natively support it.
    https://developer.mozilla.org/en-US/docs/Glossary/Polyfill
    I think we would agree that foreach fits the definition, happy path, and whole purpose of a polyfill.
    if you read up on forEach, you will notice that Array.prototype.forEach requires objects to be callable.
    https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
    
    whilenot-dev 4 days ago
    
    > I think we would agree that foreach fits the definition, happy path, and whole purpose of a polyfill
    I think you got that all wrong and strongly misinterpret "modern functionality" as some generic library here...
    Runtimes are developed against a certain spec, in this case ECMAScript, and "modern functionality" is meant as addition to iterations of such a spec. As it happens, iterations of specifications and runtimes are seldomly provided by the same entity, so both are moving forward with newly supported features, or more "modern functionality", individually.
    This behavior provokes some difficult challenges for developers. For once, they would like to work against the latest spec with its newest features, but, due to the natural dynamic of various entities doing things in different ways, these developers would also need to support older/other runtimes where such a feature is not (yet) implemented natively. Now, to bridge these supported-feature-gaps developers came up with an interesting concept: Instead of waiting and relying on the runtime to support such a new feature, it might be possible to provide an implementation as workaround, hence the "polyfill".
    So, if something A isn't currently in the spec, nor B even proposed or in discussion to be in the spec, nor C provided by any current runtime (and relied upon by developers), then I'd conclude that such a functionality is not considered to be a polyfill, as it isn't to be seen as workaround for the supported-feature-gaps due to the difference in runtimes.
    
    motorest 4 days ago
    
    > I think you got that all wrong and strongly misinterpret "modern functionality" as some generic library here...
    I didn't. I am telling you exactly why polyfills exist, and why people use them.
    More importantly, I am explaining to you why this scheme is successful.
    You don't need to write any wall of text that adds nothing. Read the facts I laid out, and use that to either understand how things work, or don't. It's your choice.
    
    whilenot-dev 4 days ago
    
    I did just explain to you why this "scheme" in the "for-each"[0] package has nothing to do with the forEach method in the Array object[1] - method VS function for once, doesn't implement a spec'ed feature secondly.
    More generously, I am explaining to you why your definition of a "polyfill" "is [NOT] successful" and isn't how it's commonly understood.
    But you do you, it's fine.
    [0]: https://www.npmjs.com/package/for-each
    [1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
    
    karel-3d 4 days ago
    
    ljharb has commit rights (edit: maybe? maybe not. not sure) to nodejs itself
    https://github.com/nodejs/node/issues/55918
    so.. I don't know. if he wanted to bad he would already
- nullc 4 days ago
  
  > is setting up for a massive, multi-pronged software supplychain attack
  The problem with this view is that the JS ecosystem is already doing that all on its own without that particular contributor. (as has the rust ecosystem, which slavishly copied JS' bad practices).
  Eliminate the one guy and JS is still pervasively vulnerable to these attacks. The polyfills are the least of it, because at least they should be completely stable and could just be copied into projects. Other dependencies not so much.
fastball 4 days ago

The author is almost certainly ljharb.
- 0x696C6961 4 days ago
  
  I'm convinced he's a rage baiting account. No-one can consistently have such bad takes.
  - antonvs 4 days ago
    
    Your faith in humanity exceeds mine.
smitty1e 4 days ago

It does raise the idea of managed backward compatibility.
Especially if you could control at install time just how far back to go, that might be interesting.
Also an immediately ridiculous graph problem for all but trivial cases.

qdw 4 days ago

One of their line items complains about being unable to bind 65k PostgreSQL placeholders (the linked post calls them "parameters") in a single query. This is a cursed idea to begin with, so I can't fully blame PostgreSQL.

From the linked GitHub issue comments, it looks like they adopted the sensible approach of refactoring their ORM so that it splits the big query into several smaller queries. Anecdotally, I've found 3,000 to 5,000 rows per write query to be a good ratio.

Someone else suggested first loading the data into a temp table and then joining against that, which would have further improved performance, especially if they wrote it as a COPY … FROM. But the idea was scrapped (also sensibly) for requiring too many app code changes.

Overall, this was quite an illuminating tome of cursed knowledge, all good warnings to have. Nicely done!

e1g 4 days ago

Another strategy is to pass your values as an array param (e.g., text[] or int[] etc) - PG is perfectly happy to handle those. Using ANY() is marginally slower than IN(), but you have a single param with many IDs inside it. Maybe their ORM didn’t support that.
motorest 4 days ago

> This is a cursed idea to begin with, so I can't fully blame PostgreSQL.
After going through the list, I was left with the impression that the "cursed" list doesn't really refers to gotchas per se but to lessons learned by the developers who committed them. Clearly a couple of lessons are incomplete or still in progress, though. This doesn't take away from their value of significance, but it helps frame the "curses" as persona observations in an engineering log instead of statements of fact.
fdr 4 days ago

that also popped out at me: binding that many parameters is cursed. You really gotta use COPY (in most cases).
I'll give you a real cursed Postgres one: prepared statement names are silently truncated to NAMEDATALEN-1. NAMEDATALEN is 64. This goes back to 2001...or rather, that's when NAMEDATALEN was increased in size from 32. The truncation behavior itself is older still. It's something ORMs need to know about it -- few humans are preparing statement names of sixty-plus characters.
- antonvs 4 days ago
  
  > few humans are preparing statement names of sixty-plus characters.
  Java developers: hold my beer
  - eadmund 4 days ago
    
    Hey, if I don’t name this class AbstractBeanFactoryVisitorCommandPatternImplementorFactoryFactoryFactorySlapObserver how would you know what it does?
Terr_ 4 days ago

> One of their line items complains about being unable to bind 65k PostgreSQL placeholders (the linked post calls them "parameters") in a single query.
I've actually encountered this one, it involved an ORM upserting lots of records, and how some tables had SQL array-of-T types, where each item being inserted consumes one bind placeholder.
That made it an intermittent/unreliable error, since even though two runs might try to touch the same number of rows and columns, you the number of bind-variables needed for the array stuff fluctuated.
burnt-resistor 4 days ago

Or people who try to send every filename on a system through xargs in a single command process invocation as arguments (argv) without NUL-terminated strings. Just hope there are no odd or corrupt filenames, and plenty of memory. Oopsie. find -print0 with parallel -0/xargs -0 are usually your friends.
Also, sed and grep without LC_ALL=C can result in the fun "invalid multibyte sequence".
Aeolun 4 days ago

I don’t think that makes intuitive sense. Whether I send 50k rows or 10x5k rows should make no difference to the database. But somehow it does. It’s especially annoying with PG, where you just cannot commit a whole lot of small values fast due to this weird limit.

burnt-resistor 4 days ago

- Windows' NTFS Alternate Data Streams (ADS) allows hiding an unlimited number of files in already existing files

- macOS data forks, xattrs, and Spotlight (md) indexing every single removable volume by default adds tons of hidden files and junk to files on said removable volumes. Solution: mdutil -X /Volumes/path/to/vol

- Everything with opt-out telemetry: go, yarn, meilisearch, homebrew, vcpkg, dotnet, Windows, VS Code, Claude Code, macOS, Docker, Splunk, OpenShift, Firefox, Chrome, flutter, and zillions of other corporate abominations

kirici 4 days ago
>opt-out telemetry: go
By default, telemetry data is kept only on the local computer, but users may opt in to uploading an approved subset of telemetry data to https://telemetry.go.dev.
To opt in to uploading telemetry data to the Go team, run:
```
    go telemetry on
```
To completely disable telemetry, including local collection, run:
```
    go telemetry off
```
https://go.dev/doc/telemetry
- burnt-resistor 4 days ago
  Yep, but you're techsplaining to someone who already know this. But still, it's not opt-in. It's always on by default and litters stuff without asking. All that does is create a file but that doesn't remove the traces of all the tracking it leaves behind without asking. This fixes it in a oneliner:
  # mac, bsd, linux, and wsl only (d="${XDG_CONFIG_HOME:-$HOME/.config}/go/telemetry";rm -rf "$d";mkdir -p "$d"&&echo off>"$d/mode")
  - kirici 4 days ago
    
    Like television and telephone, the "tele" (remote) part is the crucial and defining one. Without it, it's just metry.
TheBicPen 4 days ago

Opt-out telemetry is the only useful kind of telemetry
- burnt-resistor 4 days ago
  
  Not useful to me or most users. See, other people besides you have different values like privacy and consent.
- matheusmoreira 4 days ago
  
  The usefulness is completely irrelevant. We do not want any data exfiltration to take place under any circumstances and for any purpose whatsoever.
  We couldn't care less how much money it costs them.
- salawat 4 days ago
  
  Forces me to fork your shit and remove privacy invasive parts. Consider my computer my home, and your telemetry a camera or microphone you're adding to my place.
  If you don't ask me for permission first I have no reason to trust you will maintain any semblance of integrity in the long run.
  - burnt-resistor 4 days ago
    
    Yes, it's the approach and principle of the interaction. If {{software}} asked to opt-in to collect anonymized/generic information, for what purpose(s), and how it was being stored/anonymized to "vote" for feature use/maintenance and how it was definitely not going being used, like not being sold to data-brokers, then I might say "yes".
    Opt-out shows disrespect and that {{user}} is the product.

bigyabai 4 days ago

> Some phones will silently strip GPS data from images when apps without location permission try to access them.

That's no curse, it's a protection hex!

8organicbits 4 days ago

I think this is written unclearly. Looking at the linked issues, the root cause seems to be related to a "all file access" permission, not just fine grained location access.
It seems great that an app without location access cannot check location via EXIF, but I'm surprised that "all file access" also gates access to the metadata, perhaps one selected using the picker.
https://gitlab.com/CalyxOS/platform_packages_providers_Media...
nulld3v 4 days ago

On the other hand, one particular app completely refuses to allow users to remove location information from their photos: https://support.google.com/photos/answer/6153599?hl=en&co=GE...
mynegation 4 days ago

I have no idea what that means but to me it looks like it works as designed.
Muromec 4 days ago

A ward even

thorum 4 days ago

> npm scripts make a http call to the npm registry each time they run, which means they are a terrible way to execute a health check.

Is this true? I couldn’t find another source discussing it. That would be insane behavior for a package manager.

jrasm91 4 days ago

https://docs.npmjs.com/cli/v6/using-npm/config#update-notifi...
https://github.com/npm/cli/blob/5d82d0b4a4bd1424031fb68b4df7...
dgoldstein0 4 days ago

It might be referring to the check if whether npm is up to date so it can prompt you to update if it isn't?
skacekamen 4 days ago

probably an update check? It definitely sometimes shows an update banner

godelski 4 days ago

Looks like they're missing one. I'm pretty sure the discussion goes further back[0,1] but this one has been on going for years and seems to be the main one[2]

  05/26/23(?) Datetimes in EXIF metadata are cursed

[0] https://github.com/immich-app/immich/discussions/2581

[1] https://github.com/immich-app/immich/issues/6623

[2] https://github.com/immich-app/immich/discussions/12292

a96 4 days ago

Datetimes in general have have a tendency to be cursed. Even when they work, something adjacent is going to blow up sooner or later. Especially if it relies on timezones or DST being in the value.

joshdavham 4 days ago

This is awesome! Does anyone else wanna share some of the cursed knowledge they've picked up?

For me, MacOS file names are cursed:

1. Filenames in MacOS are case-INsensitive, meaning file.txt and FILE.txt are equivalent

2. Filenames in MacOS, when saved in NFC, may be converted to NFD

archagon 4 days ago

I'm in the process of writing up a blog post on how network shares on macOS are kind of cursed. Highlights:
* Files on SMB shares sometimes show up as "NDH6SA~M" or similar, even though that's not their filename on the actual network drive. This is because there's some character present in the filename that SMB can't work with. No errors or anything, you just have to know about it.
* SMB seems to copy accented characters in filenames as two Unicode code points, not one. Whereas native macOS filenames tend to use single Unicode code point accents.
* SMB seems to munge and un-munge certain special characters in filenames into placeholders, e.g. * <-> . But not always. Maybe this depends on the SMB version used?
* SMB (of a certain version?) stores symlinks as so-called "XSym" binary files, which automatically get converted back to native symlinks when copied from the network share. But if you try to rsync directly from the network drive instead of going through SMB, you'll end up with a bunch of binary XSym file that you can't really do anything with.
I only found out about these issues through integrity checks that showed supposedly missing files. Horrible!
account42 a day ago

> 1. Filenames in MacOS are case-INsensitive, meaning file.txt and FILE.txt are equivalent
It's much more cursed than that: filenames may or may not be case-sensitive depending on the filesystem.
qingcharles 4 days ago

I created one of the first CDDBs in 1995 when Windows 95 was in beta. It came with a file, IIRC, cdplayer.ini, that contained all the track names you'd typed in from your CDs.
I put out requests across the Net, mostly Usenet at the time, and people sent me their track listings and I would put out a new file every day with the new additions.
Until I hit 64KB which is the max size of an .ini file under Windows, I guess. And that was the end of that project.
burnt-resistor 4 days ago

Yep. Create a case-sensitive APFS or HFS+ volume for system or data, and it guarantees problems.
- nothrabannosir 4 days ago
  
  I’ve done this with my main drive for the last ten or so years and run into not a single problem. I recommend it.
  - burnt-resistor 4 days ago
    
    Then you don't use Time Machine, Migration Assistant, cmake, or a host of other development and systems tools that don't work correctly on case-sense APFS volumes.
    Sorry, but this is terrible advice unsuitable for all audiences. It might seem to work for now but it's walking in a minefield of nonstandard configuration that could bite anytime in the future.
    https://forums.macrumors.com/threads/does-anyone-else-use-a-...
    https://forums.macrumors.com/threads/heads-up-currently-impo...
    https://apple.stackexchange.com/questions/474537/time-machin...
    https://gitlab.kitware.com/cmake/cmake/-/issues/26333
    
    nothrabannosir 4 days ago
    
    I did use Time Machine, but thankfully wised up and left for restic instead. Using Time Machine is worse advice than case insensitive filesystem if you ask me. Inscrutable logs and silent failures.
    I’m sure there are those who found problems, but fact remains that in ten years I never have. What I have found is a lot of warnings against it by people who don’t use it themselves, like the person in the second link.
    I recommend it, and the more people use it, the more people can help fix bugs they encounter (if any?) like in that last link you posted.
    PS the Time Machine error in your third link is apparently about a CI source to a CS target. I hope it’s fair to say: don’t do that?
mdaniel 4 days ago
1 is only true by default, both HFS and APFS have case sensitive options . NTFS also behaves like you described, and I believe the distinction is that the filesystems are case-retentive, so this will work fine:
```
  $ echo yup > README.txt
  $ cat ReAdMe.TXT
  yup
  $ ls
  README.txt
```
Maybe the cursed version of the filesystem story is that goddamn Steam refuses to install on the case sensitive version of the filesystem, although Steam has a Linux version. Asshats

Ygg2 4 days ago

Why is the YAML part cursed? They serialize to same string, no? Both [1] and [2] serialize to identical strings. This seems like the ancient YAML 1.1 parser curse strikes again.

[1] https://play.yaml.io/main/parser?input=ICAgICAgdGVzdDogPi0KI...

[2]https://play.yaml.io/main/parser?input=ICAgICAgdGVzdDogPi0KI...

binary132 4 days ago

This would be a fun github repo. Kind of like Awesome X, but Cursed.

maxbond 4 days ago

> Fetch requests in Cloudflare Workers use http by default, even if you explicitly specify https, which can often cause redirect loops.

This is whack as hell but doesn't seem to be the default? This issue was caused by the "Flexible" mode, but the docs say "Automatic" is the default? (Maybe it was the default at the time?)

> Automatic SSL/TLS (default)

https://developers.cloudflare.com/ssl/origin-configuration/s...

motorest 4 days ago

> This is whack as hell but doesn't seem to be the default?
I don't think so. If you read about what Flexible SSL means, you are getting exactly what you are asking for.
https://developers.cloudflare.com/ssl/origin-configuration/s...
Here is a direct quote of the recommendation on how this feature was designed to be used:
> Choose this option when you cannot set up an SSL certificate on your origin or your origin does not support SSL/TLS.
Furthermore, Cloudflare's page on encryption modes provides this description of their flexible mode.
> Flexible : Traffic from browsers to Cloudflare can be encrypted via HTTPS, but traffic from Cloudflare to the origin server is not. This mode is common for origins that do not support TLS, though upgrading the origin configuration is recommended whenever possible.
So, people go out of their way to set an encryption mode that was designed to forward requests to origin servers that do not or cannot support HTTPS connections, and then are surprised those outbound connections to their origin servers are not HTTPS.
- jrasm91 4 days ago
  
  It was the default at the time so we had no idea this behavior would be applied to a fetch request in a worker. That combined with no other indication that it was happening made it a real PITA to debug.
- maxbond 4 days ago
  
  I get that it's a compatibility workaround (I did look at the docs before posting) but it's a.) super dangerous and b.) apparently was surprising to the authors of this post. I'm gunnuh keep describing "communicate with your backend in plain text and get caught in infinite redirect loops mode" whack but reasonable people may disagree.
  I would like to know how this setting got enabled, however. And I don't think the document should describe it as a "default" if it isn't one.
  - motorest 4 days ago
    
    > I get that it's a compatibility workaround (...) but it's a.) super dangerous (...)
    It's a custom mode where you explicitly configure your own requests to your own origin server to be HTTP instead of HTTPS. Even Cloudflare discourages the use of this mode, and you need to go way out of your way to explicitly enable it.
    > (...) apparently was surprising to the authors of this post.
    The post is quite old, and perhaps Cloudflare's documentation was stale back then. However, it is practically impossible to set flexible mode being aware of what it means and what it does.
    > I would like to know how this setting got enabled, however.
    Cloudflare's docs state this is a custom encryption mode that is not set by default and you need to purposely go to the custom encryption mode config panel to pick this option among half a dozen other options.
    Perhaps this was not how things were done back then, but as it stands this is hardly surprising or a gotcha. You need to go way out of your way to configure Cloudflare to do what amounts to TLS termination at the edge, and to do so you need to skip a bunch of options that enforce https.
    
    maxbond 4 days ago
    
    It seems like you think I'm operating under a misunderstanding as a result of not having looked at the docs. I looked at them before commenting, and described them accurately if tersely in my original comment. We just disagree.
    I didn't mean "I would like to know" in some sort of conspiratorial way, I just thought there was a story to be told there.
bo0tzz 4 days ago

It was indeed the default at the time.

egruy 4 days ago

Reminds me a lot of phenomenal Hadoop and Kerberos: Madness beyond the gates[1], which coincidentally saved me many times from madness. Thanks Steve, I can't fathom what you had to go through to get the cursed knowledge!

1 - https://steveloughran.gitbooks.io/kerberos_and_hadoop/conten...

tonyhart7 4 days ago

ok but this one is not cursed tho (https://github.com/immich-app/immich/discussions/11268)

its valid privacy and security on how mobile OS handle permission

account42 a day ago

It is cursed because now the photo management app needs to ask for the permission to constantly track you instead of only getting location of a limited set of past points where you specifically chose to take a photo. Besides giving malicious photo app developers an excuse for these permissions, it also contributes to permission fatigue by training to give random applications wide permissions.

LeoPanthera 4 days ago

"Some phones will silently strip GPS data from images when apps without location permission try to access them."

Uh... good?

steve_adams_86 4 days ago

I'm torn. Maybe a better approach would be a prompt saying "you're giving access to images with embedded location data. Do you want to keep the location data in the images, or strip the location data in this application?"
I might not want an application to know my current, active location. But it might be useful for it to get location data from images I give it access to.
I do think if we have to choose between stripping nothing or always stripping if there's no location access, this is the correct and safe solution.
- Aurornis 4 days ago
  
  > saying "you're giving access to images with embedded location data. Do you want to keep the location data in the images, or strip the location data in this application?"
  This is a good example of a complex setting that makes sense to the 1% of users who understand the nuances of EXIF embedded location data but confuses the 99% of users who use a product.
  It would also become a nightmare to manage settings a per-image basis.
  - fc417fc802 4 days ago
    
    Not per-image, it would be per-app. The first time it happened it would ask you. There are already quite a few per-app toggles for things like this so it wouldn't be anything new or particularly surprising.
    That said, an alternative to bugging the user might be that when the app makes the call to open the file that call should fail unless the app explicitly passes a flag to strip the location data. That way you protect users without causing needless confusion for developers when things that ought to "just work" go inexplicably wrong for them.
a96 4 days ago

Kind of. But that means any file that goes through that mechanism may be silently modified. Which is evil.
account42 a day ago

It is cursed because now the photo management app needs to ask for the permission to constantly track you instead of only getting location of a limited set of past points where you specifically chose to take a photo. Besides giving malicious photo app developers an excuse for these permissions, it also contributes to permission fatigue by training to give random applications wide permissions.

zzo38computer 4 days ago

> Zitadel is cursed because its custom scripting feature is executed with a JS engine that doesn't support regex named capture groups.

I think sufficiently old version of JavaScript will not have it. It does not work on my computer either. (You should (if you had not already) report this to whoever maintains that program, in order to fix this, if you require that feature.)

> Git can be configured to automatically convert LF to CRLF on checkout and CRLF breaks bash scripts.

Can you tell git that the bash script is a binary file and therefore should not automatically convert the contents of the file?

> Fetch requests in Cloudflare Workers use http by default, even if you explicitly specify https, which can often cause redirect loops.

Is that a bug in Cloudflare? That way of working does not make sense; it should use the protocol you specify. (I also think that HTTP servers should not generally automatically redirect to HTTPS, but that is a different problem. Still, since it does that it means that this bug is more easily found.) (Also, X.509 should be used for authentication, which avoids the problem of accidentally authenticating with an insecure service (or with the wrong service), since that would make it impossible to do.)

> There is a user in the JavaScript community who goes around adding "backwards compatibility" to projects. They do this by adding 50 extra package dependencies to your project, which are maintained by them.

It is a bad idea to add too many dependencies to your project, regardless of that specific case.

> The bcrypt implementation only uses the first 72 bytes of a string. Any characters after that are ignored.

There is a good reason to have a maximum password length (to avoid excessive processing due to a too long password), although the maximum length should still be sufficiently long (maybe 127 bytes is good?), and it should be documented and would be better if it should be known when you try to set the password.

> Some web features like the clipboard API only work in "secure contexts" (ie. https or localhost)

I think that "secure contexts" is a bad idea. I also think that these features should be controlled by user settings instead, to be able to disable and otherwise configure them.

mdaniel 3 days ago

> Can you tell git that the bash script is a binary file and therefore should not automatically convert the contents of the file?
That'd be swatting a fly with a sledgehammer; if you do that, $(git diff) will no longer work which smells important for shell scripts that evolve over time. But I think you were in the right ballpark in that .gitattributes is designed for helping it understand the behavior you wish with eol=lf just for either that file or *.sh *.bash etc https://git-scm.com/docs/gitattributes#Documentation/gitattr...

csours 4 days ago

You can load Java Classes into Oracle DB and run them natively inside the server.

Those classes can call stored procedures or functions.

Those classes can be called BY stored procedures or functions.

You can call stored procedures and functions from server-side Java code.

So you can have a java app call a stored proc call a java class call a stored proc ...

Yes. Yes, this is why they call it Legacy.

mdaniel 4 days ago

That's ok, modern nodejs apps are represented, too, so everyone can get in on the legacy train: https://docs.oracle.com/en/database/oracle/oracle-database/2... and https://docs.aws.amazon.com/AmazonRDS/latest/AuroraPostgreSQ...
or, if the modern job postings are indicative, FastAPI to PG to PY https://www.postgresql.org/docs/17/plpython-funcs.html

physicles 4 days ago

Back in 2011, I wasted an entire afternoon on some string handling code that was behaving very strangely (I don’t remember exactly what the code was).

It wasn’t until I loaded the content into a hex editor that I learned about U+00A0, the non-breaking space. Looks like a space, but isn’t.

mdaniel 4 days ago

Ah, yes, the 90s html was jam packed with   (aka  ) to make things not wrap, and that was stellar good fun for copy-pasting
The other "2020s" problem is some leading unicode marks which are also invisible. I thought it was BOM but those do seem to show up to cat but just a few weeks ago I had a file from a vendor's site that wouldn't parse but that both cat and vim said was fine, only to find the wtf? via the almighty xxd

phreack 4 days ago

Love this. I seem to find a new one every day maintaining an Android app with millions of users. We like to call them "what will we tell the kids" moments. It's a great idea to write them down, I'll probably start doing it!

qbane 4 days ago

Love to see this concept condensed! This kind of knowledge will only emerge only after you dive in your project and surprisingly find things do not work as thought (inevitable if the project is niche enough). Will keep a list like that for every future project.

mitioshi 3 days ago

What a cool idea to have such page for a project. I wish more open source projects adopted this. It's always interesting to read how people resolved complex problems

stogot 4 days ago

This is the best thing I’ve read on hacker news all year

hiAndrewQuinn 4 days ago

>The bcrypt implementation only uses the first 72 bytes of a string. Any characters after that are ignored.

Is there any good reason for this one in particular?

mras0 4 days ago

bcrypt is based on the blowfish cipher which "only" support keys up to 576 bits (72 bytes) in length (actually only 448 bits as spec'ed). Wikipedia has all the details.

Havoc 4 days ago

One can really sense the pain just reading the headings

Also a crypto library that limits passwords to 72 bytes? That’s wild

AstralStorm 4 days ago

It's written with constant memory allocation in mind. Silly of them to use such a small buffer though, make it a configuration option.
- mras0 4 days ago
  
  No, it's due to the construction of bcrypt - it ends up using the password more or less directly as the key for blowfish (the underlying cipher) which is why the limit is there. Check wikipedia for details.
- nothrabannosir 4 days ago
  
  I assumed all hashes are in O(1) space? Is there any that’s not?

worik 4 days ago

dd/mm/yyyy date formats are cursed....

Perhaps it is mm/dd/yyyy (really?!?) that is cursed....

armchairhacker 4 days ago

dd/mm/yyyy is most common worldwide (particularly Europe, India, Australia) followed by yyyy/mm/dd (particularly China, Japan, South Korea).
https://wikipedia.org/wiki/Date_and_time_representation_by_c...
IMO the best format is yyyy/mm/dd because it’s unambiguous (EDIT: almost) everywhere.
- Izkata 4 days ago
  
  For a really cursed one that breaks your last comment, check out Kazakhstan on the list by country: https://en.wikipedia.org/wiki/List_of_date_formats_by_countr...
  > Short format: (yyyy.dd.mm) in Kazakh[95][obsolete source]
  - o11c 4 days ago
    
    Even ISO has used the cursed date format.
    ISO-IR-26 was registered on 1976/25/03.
- fastball 4 days ago
  
  Not only is YYYY/MM/DD unambiguous, but it also sorts correctly by date when you perform a naive alphabetical sort.
  - LoganDark 4 days ago
    
    I believe YYYY-MM-DD is even less ambiguous than YYYY/MM/DD.
    
    a96 4 days ago
    
    Correct. Slashes mean it's a yank date and going to be backwards. Dashes hint that it's going to be (close to) ISO standard.
    
    _Algernon_ 4 days ago
    
    And it doesn't use a path-separator character for the date.
    
    tom_ 4 days ago
    
    Slashes are used for dd/mm/yyyy as well. Dashes are indeed better if you want a separator. or use the separator-free ISO 8601 syntax.
    
    worik 2 days ago
    
    I just use the number of peta-seconds since I was born....
- accrual 4 days ago
  
  I like CCYY-MM-DD because it's also a valid file name on most systems, and using "CCYY" (century + year) instead of "YYYY" feels fancy.
  - Fwirt 4 days ago
    
    Except this could get confusing because the year 1976 (for example) is actually in the 20th century.
    
    accrual 4 days ago
    
    That is a good point. The "ordinal" century doesn't exactly line up with the digits in a "YYYY" format, thus "CCYY" creates some ambiguity depending on how one defines "century".
    I conclude my fanciness of using "CCYY" is not useful. :)
javcasas 4 days ago

mm/dd/yyyy is cursed. You parse it naively with momentjs, and some times it parses (wrong), other times it doesn't parse.
It's the reason our codebase is filled with momentAmerican, parseDateAmerican and parseDatetimeAmerican.
hollerith 4 days ago

mm.dd.yyyy is cursed, too. The not-cursed options are dd.mm.yyyy and mm/dd/yyyy
- dmd 4 days ago
  
  in what world could mm/dd/yyyy not be cursed!? that makes no sense whatsoever.
  - Izkata 4 days ago
    
    It's the US short form, matching the word-month order we always use for regular dates: "August 7, 2025".
    Note the slashes are important, we don't use dots or dashes with this order. That's what GP was getting at.
    
    chrismorgan 4 days ago
    
    > It's the US short form, matching the word-month order we always use for regular dates: "August 7, 2025".
    Counterexample: US Independence Day is called the “Fourth of July”.
    I would agree that, for dates with named months, the US mostly writes “August 8, 2025” and says “August eighth, 2025” (or sometimes “August eight, 2025”, I think?), and other countries mostly write “8 August 2025” and say “the eighth of August, 2025”; but neither is absolute.
    
    Izkata 4 days ago
    
    Not really a counterexample, that's a holiday, not a regular date.
    
    dmd 4 days ago
    
    And it makes absolutely no sense. I've lived with it all my life (I'm an American!) and it has never made any sense to me.
    
    kstrauser 4 days ago
    
    First, I use ISO8601 for everything. This is not me arguing against it.
    But, I think the American-style formatting is logical for everyday use. When you're discussing a date, and you're not a historian, the most common reason is that you're making plans with someone else or talking about an upcoming event. That means most dates you discuss on a daily basis will be in the next 12 months. So starting with the month says approximately when in the next year you're talking about, giving the day next says when in that month, and then tacking on the year confirms the common case that you mean the next occurrence of it.
    When's Thanksgiving? November (what part of the year?) 27 (toward the end of that November), 2025 (this year).
    It's like answering how many minutes are in a day: 1 thousand, 4 hundred, and 40. You could say 40, 400, and 1000, which is still correct, but everyone's going to look at you weirdly. Answer "2025 (yeah, obviously), the 27th (of this month?) of November (why didn't you start with that?)" is also correct, but it sounds odd.
    So 11/27/2025 starts with the most useful information and works its way to the least, for the most common ways people discuss dates with others. I get it. It makes since.
    But I'll still use ISO8601.
    
    out_of_protocol 4 days ago
    
    > So 11/27/2025 starts with the most useful information
    Most useful information would be to not confuse it. E.g. you see a event date 9/8/2025 and it's either tomorrow or a month from now. Perfect 50/50% chance to miss it or make a useless trip
    
    hollerith 4 days ago
    
    Can you explain why on a traffic light, red means stop and green means go? Why not the other way around?
    
    a96 4 days ago
    
    Red is an aggravating colour psychologically. It's pretty universally used as a warning. Red lights in cars also mean "not ready to drive". Brake lights are also red for similar reason. "Seeing red."
    
    worik 2 days ago
    
    ...and red car means "warning! outrageous jersey at the wheel"
    Wait, no it dose not....
    
    fc417fc802 4 days ago
    
    Because it's arbitrary. Unlike a date format where the components have relative meaning to one another, can be sorted based on various criteria, and should smoothly integrate with other things.
    As a US native let me clearly state that the US convention for writing dates is utterly cursed. Our usage of it makes even less sense than our continued refusal to adopt the metric system.
    
    FridgeSeal 4 days ago
    
    The short form doesn’t match the word form though.
    If you wanted a short form to match the word form, you go with something like:
    “mmmm/dd/yyyy”
    Where mmmm is either letters, or a 2-character prefix. The word form “August 7th…” is packing more info that the short form.

burnt-resistor 4 days ago

Install an SP3 or TR4 socketed CPU in a dusty, dirty room without ESD precautions and crank the torque on the top plate and heat sink like truck lug nuts until creaking and cracking noises of the PCB delaminating are noticeable. Also be sure to sneeze on the socket's chip contacts and clean it violently with an oily and dusty microfiber cloth to bend every pin.

c. 2004 and random crap on eBay: DL380 G3 standard NICs plus Cisco switches with auto speed negotiation on both sides have built-in chaos monkey duplex flapping.

Google's/Nest mesh Wi-Fi gear really, really enjoys being close together so much that it offers slower speeds than simply 1 device. Not even half speed, like dial-up before 56K on random devices randomly.

g8oz 4 days ago

This is awesome. Disappointing to hear about the Cloudflare fetch issue.

doctorpangloss 4 days ago

The infallibility of Cloudflare is sacrosanct!
motorest 4 days ago

> Disappointing to hear about the Cloudflare fetch issue.
You mean the one where explicitly configuring Cloudflare to forward requests to origin servers as HTTP will actually send requests as HTTP? That is not what I would describe as disappointing.
- g8oz 4 days ago
  
  The behavior seems likely to mislead a lot of people even if it doesn't confuse you.
  - motorest 4 days ago
    
    > The behavior seems likely to mislead a lot of people even if it doesn't confuse you.
    You need to go way out of your way to toggle a switch to enable this feature.
    The toggle says very prominently "Cloudflare allows HTTPS connections between your visitor and Cloudflare, but all connections between Cloudflare and your origin are made through HTTP."
    You proceed to enable this feature.
    Does it confuse you that Cloudflare's requests to your origin servers are HTTP?