In the past, browsers used an algorithm which only denied setting wide-ranging cookies for top-level domains with no dots (e.g. com or org). However, this did not work for top-level domains where only third-level registrations are allowed (e.g. co.uk). In these cases, websites could set a cookie for .co.uk which would be passed onto every website registered under co.uk.
Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list. This is the aim of the Public Suffix List.
(https://publicsuffix.org/learn/)
So, once they realized web browsers are all inherently flawed, their solution was to maintain a static list of domain suffixes.
If you're going to host user content on subdomains, then you should probably have your site on the Public Suffix List (https://publicsuffix.org/list/). That should eventually make its way into various services so they know that a tainted subdomain doesn't taint the entire site.
I think it's somewhat tribal webdev knowledge that if you host user-generated content you need to be on the PSL; otherwise you'll eventually end up where Immich is now.
I'm not sure how people who haven't already hit this very issue are supposed to know about it beforehand, though. It's one of those things you don't really come across until you're hit by it.
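To make the mechanics concrete, here is a minimal sketch using the `psl` npm package (an assumption; any PSL implementation behaves the same way) to compute the registrable domain a consumer of the list would see:

```typescript
import psl from "psl";

// Today, ".cloud" is the public suffix, so every preview subdomain rolls
// up to the same registrable domain:
console.log(psl.get("pr-123.preview.internal.immich.cloud"));
// => "immich.cloud" -- a flag on one preview subdomain taints the whole site

// If "immich.cloud" itself were added to the PSL, it would become the
// public suffix, and list consumers would treat "internal.immich.cloud"
// as an independent registrable domain instead.
```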
This is the first time I've heard about https://publicsuffix.org
You're in good company! From 12 days ago: https://news.ycombinator.com/item?id=45538760
I've been doing this for at least 15 years and it's the first I've heard of this.
It's fun learning new things so often, but I'd never once heard of the Public Suffix List.
That said, I do know the other best practices mentioned elsewhere.
so it's a skill issue??? or just google being bad????
I will go with Google being bad / evil for 500.
Google of the 90s to 2010 is nothing like Google 2025. There is a reason they removed "Don't be evil"... being evil and authoritarian makes more money.
God I hate the web. The engineering equivalent of a car made of duct tape.
"The engineering equivalent of a car made of duct tape"
Kind of. But do you have a better proposition?
They aren't hosting user content; it was their pull request preview domains that were triggering it.
This is very clearly just bad code from Google.
The root cause is bad behaviour by google. This is merely a workaround.
Remember, this is a free service that Google is offering for even their competitors to use.
And it is an incredibly valuable thing. You might not think it is, but the internet is filled with utterly dangerous, scammy, phishy, malware-ridden websites, and every day Safe Browsing (via Chrome, Firefox, and Safari - yes, Safari uses Safe Browsing) keeps users safe.
If Immich didn't follow best practice, that's Google's fault? You're showing your naivety and bias here.
Please point me to where GoDaddy or any other hosting site mentions the public suffix list, or where Apple or Google or Mozilla publish a list of hosting best practices that includes avoiding false positives from Safe Browsing…
>You might not think it is, but the internet is filled with utterly dangerous, scammy, phishy, malware-ridden websites
Google is happy to take their money and show scammy ads. Google ads are the most common vector for fake software support scams. Most people google something like "microsoft support" and end up there. Has Google ever banned their own ad domains?
Google is the last entity I would trust to be neutral here.
Oh c’mon. Google does not offer free services. Everyone should know that by now.
I thought this story would be about some malicious PR that convinced their CI to build a page featuring phishing, malware, porn, etc. It looks like Google is simply flagging their legit, self-created Preview builds as being phishing, and banning the entire domain. Getting immich.cloud on the PSL is probably the right thing to do for other reasons, and may decrease the blast radius here.
Is that actually relevant when only images are user content?
Normally I see the PSL in context of e.g. cookies or user-supplied forms.
> Is that actually relevant when only images are user content?
Yes. For instance in circumstances exactly as described in the thread you are commenting in now and the article it refers to.
Services like Google's bad-site warning system may use it as a signal that they shouldn't consider a whole domain harmful when only a small number of its subdomains are, where otherwise they would. It is no guarantee, of course.
I think this is only true if you host independent entities. If you simply construct deep names about yourself, with a demonstrable chain of authority back to you, I don't think the PSL wants to know. Otherwise there is no hierarchy: the dots are just convenience strings, and it's a flat namespace the size of the PSL.
Does Google use this for Safe Browsing though?
Looks like it? https://developers.google.com/safe-browsing/reference/URLs.a...
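For the curious, here's a hedged sketch of a lookup against that API, with the request shape following Google's public v4 Lookup API docs at the link above; the API key and checked URL are placeholders:

```typescript
const API_KEY = "YOUR_API_KEY"; // placeholder

async function checkUrl(url: string) {
  const res = await fetch(
    `https://safebrowsing.googleapis.com/v4/threatMatches:find?key=${API_KEY}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        client: { clientId: "example-client", clientVersion: "1.0" },
        threatInfo: {
          threatTypes: ["MALWARE", "SOCIAL_ENGINEERING"],
          platformTypes: ["ANY_PLATFORM"],
          threatEntryTypes: ["URL"],
          threatEntries: [{ url }],
        },
      }),
    },
  );
  // An empty object means no match; flagged URLs come back in a "matches" array.
  return res.json();
}
```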
This is not about user content, but about their own preview environments! Google decided their preview environments were impersonating... Something? And decided to block the entire domain.
Aw. I saw Jothan Frakes and briefly thought my favorite Starfleet first officer's actor had gotten into writing software later in life.
Be sure to see the team's whole list of Cursed Knowledge. https://immich.app/cursed-knowledge
The Postgres query parameters one is funny. 65k parameters is not enough for you?!
> PostgreSQL USER is cursed
> The USER keyword in PostgreSQL is cursed because you can select from it like a table, which leads to confusion if you have a table named user as well.
is even funnier :D
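For anyone who wants to see the curse in action, a small demonstration with node-postgres (connection details assumed via the usual PG* environment variables):

```typescript
import { Client } from "pg";

const client = new Client(); // assumes PG* env vars for the connection
await client.connect();

// The bare word "user" resolves to the SQL USER keyword, not your table:
const cursed = await client.query("SELECT * FROM user");
console.log(cursed.rows); // e.g. [ { user: 'postgres' } ] -- the current role

// To reach an actual table named "user", quote the identifier:
const intended = await client.query('SELECT * FROM "user"');
console.log(intended.rows);

await client.end();
```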
As it says, bulk inserts with large datasets can fail. Inserting a few thousand rows into a table with 30 columns will hit the limit. You might run into this if you were synchronising data between systems or running big batch jobs.
SQLite used to have a limit of 999 query parameters, which was much easier to hit. It's now a roomy 32k.
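The usual workaround, sketched below, is to size each batch so that rows × columns stays under the cap (65535 for Postgres, 32766 for recent SQLite):

```typescript
// Split rows into batches small enough for one multi-row INSERT each.
function chunkForInsert<T>(rows: T[], columns: number, maxParams = 65535): T[][] {
  const rowsPerBatch = Math.floor(maxParams / columns); // 30 columns -> 2184 rows
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += rowsPerBatch) {
    batches.push(rows.slice(i, i + rowsPerBatch));
  }
  return batches;
}
```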
Right, for Postgres I would use unnest for inserting a non-static number of rows.
COPY is often a usable alternative.
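A sketch of the unnest approach with node-postgres (the `photos` table and its columns are hypothetical): arrays are bound as one parameter per column, so the parameter count no longer scales with the number of rows.

```typescript
import { Client } from "pg";

async function bulkInsert(client: Client, rows: { id: string; name: string }[]) {
  // Two bind parameters total, regardless of rows.length.
  await client.query(
    `INSERT INTO photos (id, name)
     SELECT * FROM unnest($1::text[], $2::text[])`,
    [rows.map((r) => r.id), rows.map((r) => r.name)],
  );
}
```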
Insane that one company can dictate what websites you're allowed to visit. Telling you what apps you can run wasn't far enough.
I really don't know how they got nerds to think scummy advertising is cool. If you think about it, the thing they make money on is something no user actually wants or ever wants to see. Somehow Google has some sort of nerd cult where people think it's cool to join such an unethical company.
Turns out it's cool to make lots of money
The one thing I never understood about these warnings is how they don't run afoul of libel laws. They are directly calling you a scammer and "attacker". The same for Microsoft with their unknown executables.
They used to be more generic, saying "We don't know if it's safe", but now they are quite assertive in stating you are indeed an attacker.
This may not be a huge issue depending on mitigating controls, but are they saying that anyone can submit a PR (containing anything) to Immich, tag the PR with `preview`, and have the contents of that PR hosted on https://pr-<num>.preview.internal.immich.cloud?
Doesn't that effectively let anyone host anything there?
I think only collaborators can add labels on GitHub, so not quite. It does seem a bit hazardous though (you could submit a legit PR, get the label, and then commit whatever you want?).
Exposure also extends not just to the owner of the PR but to anyone with write access to the branch from which it was submitted. GitHub pushes are SSH-authenticated and often automated in many workflows.
Excellent idea for cost-free phishing.
There's a reason GitHub uses github.io for user content.
A friend / client of mine used some kind of WordPress-type hosting service with a simple redirect. The host got onto the bad-sites list.
This also polluted their own domain, even when the redirect was removed, and had the odd side effect that Google would no longer accept email from them. We requested a review and passed it, but the email blacklist appears to be permanent. (I already checked and there are no spam problems with the domain.)
We registered a new domain. Google’s behaviour here incidentally just incentivises bulk registering throwaway domains, which doesn’t make anything any better.
Wow. That scares me. I've been using my own domain for 25 years; it got (wrongly) blacklisted this week, and I can't imagine having email impacted on top of that.
Them maintaining a page of gotchas is a really cool idea - https://immich.app/cursed-knowledge
> There is a user in the JavaScript community who goes around adding "backwards compatibility" to projects. They do this by adding 50 extra package dependencies to your project, which are maintained by them.
This is a spicy one, would love to know more.
Tangential to the flagging issue, but is there any documentation on how Immich is doing the PR site generation feature? That seems pretty cool, and I'd be curious to learn more.
Pretty sure Immich is on github, so I assume they have a workflow for it, but gitlab has first-class support for this which I've been using for years: https://docs.gitlab.com/ci/review_apps/
It's open source, you can find this trivially yourself in less than a minute.
https://github.com/immich-app/devtools/tree/a9257b33b5fb2d30...
Wow. What a rude way to answer.
If you block those internal subdomains from search with robots.txt, does Google still whine?
I've heard anecdotes of people using an entirely internal domain like "plex.example.com": even if it's never exposed to the public internet, Google might flag it as impersonating Plex. Google will sometimes block a site based only on its name, if they think the name is impersonating another service.
It's unclear exactly what conditions cause a site to get blocked by Safe Browsing. My nextcloud.something.tld domain has never been flagged, but I've seen support threads of other people having issues, and the domain name is the best guess.
I'm almost positive GMail scanning messages is one cause. My domain got put on the list for a URL that would have been unknowable to anyone but GMail and my sister who I invited to a shared Immich album. It was a URL like this that got emailed directly to 1 person:
https://photos.example.com/albums/xxxxxxxx-xxxx-xxxx-xxxx-xx...
Then suddenly the domain is banned even though there was never a way to discover that URL besides GMail scanning messages. In my case, the server is public so my siblings can access it, but there's nothing stopping Google from banning domains for internal sites that show up in emails they wrongly classify as phishing.
Think of how Google and Microsoft destroyed self hosted email with their spam filters. Now imagine that happening to all self hosted services via abuse of the safe browsing block lists.
If it was just the domain: remember that there is a Certificate Transparency log for all TLS certs issued nowadays by valid CAs, which is probably also what Google is using to discover new active domains.
It doesn’t seem like email scanning is necessary to explain this. It appears that simply having a “bad” subdomain can trigger this. Obviously this heuristic isn’t working well, but you can see the naive logic of it: anything with the subdomain “apple” might be trying to impersonate Apple, so let’s flag it. This has happened to me on internal domains on my home network that I've exposed to no one. This also has been reported at the jellyfin project: https://github.com/jellyfin/jellyfin-web/issues/4076
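A naive guess at what such a heuristic might look like (an illustration only, not Google's actual logic):

```typescript
const BRANDS = ["apple", "paypal", "gmail", "plex", "nextcloud"];

function looksLikeImpersonation(host: string): boolean {
  // Crudely treat the last two labels as the registrable domain.
  const subLabels = host.split(".").slice(0, -2);
  return subLabels.some((label) => BRANDS.some((b) => label.includes(b)));
}

looksLikeImpersonation("plex.example.com");   // true -- the reported false positive
looksLikeImpersonation("photos.example.com"); // false
```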
Well, that's potentially horrifying. I would love for someone to attempt this in as controlled a manner as possible. I would assume it's possible for anyone using Google DNS servers to also trigger some type of metadata inspection resulting in this type of situation as well.
Also - when you say banned, you're speaking of the "red screen of death", right? Not a broader ban on the domain using Google Workspace services, yeah?
Chrome sends visited URLs to Google (YMMV depending on the settings and consents you have given).
This seems related to another hosting site that got caught out by this recently:
https://news.ycombinator.com/item?id=45538760
Not quite the same (other than being an abuse of the same monopoly) since this one is explicitly pointing to first-party content, not user content.
Regarding how Google Safe Browsing actually works under the hood, here is a good writeup from the Chromium team:
https://blog.chromium.org/2021/07/m92-faster-and-more-effici...
Not sure if this is exactly the scenario from the discussed article but it's interesting to understand it nonetheless.
TL;DR the browser regularly downloads a dump of color profile fingerprints of known bad websites. Then when you load whatever website, it calculates the color profile fingerprint of it as well, and looks for matches.
(This could be outdated and there are probably many other signals.)
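For a feel of what a color-profile fingerprint could look like, here is a purely conceptual sketch (not Chromium's implementation): reduce the rendered page's pixels to a coarse, normalized color histogram that can be matched against known-bad fingerprints.

```typescript
// rgba: raw pixel data, e.g. from CanvasRenderingContext2D.getImageData().data
function colorFingerprint(rgba: Uint8ClampedArray, buckets = 8): number[] {
  const hist: number[] = new Array(buckets ** 3).fill(0);
  const step = 256 / buckets;
  for (let i = 0; i < rgba.length; i += 4) {
    const r = Math.floor(rgba[i] / step);
    const g = Math.floor(rgba[i + 1] / step);
    const b = Math.floor(rgba[i + 2] / step);
    hist[(r * buckets + g) * buckets + b]++; // ignore alpha at i + 3
  }
  const pixelCount = rgba.length / 4;
  return hist.map((count) => count / pixelCount); // normalize out page size
}
```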
Is there any linkage to the semi-factoid that the Immich web GUI looks very like Google Photos, or is that just one of those coincidences?
Not a coincidence, Immich was started as a personal replacement for Google Photos.
I tried to submit this, but the direct link here is probably better than the Reddit thread I linked to:
https://old.reddit.com/r/immich/comments/1oby8fq/immich_is_a...
I had my personal domain I use for self-hosting flagged. I've had the domain for 25 years and it's never had a hint of spam, phishing, or even unintentional issues like compromised sites / services.
It's impossible to know what Google's black box is doing, but, in my case, I suspect my flagging was the result of failing to use a large email provider. I use MXRoute for locally hosted services and network devices because they do a better job of giving me simple, hard limits for sending accounts. That way, if anything I run ever gets compromised, the damage in terms of spam will be limited to (e.g.) 10 messages every 24h.
I invited my sister to a shared Immich album a couple days ago, so I'm guessing that GMail scanned the email notifying her, used the contents + some kind of not-google-or-microsoft sender penalty, and flagged the message as potential spam or phishing. From there, I'd assume the linked domain gets pushed into another system that eventually decides they should blacklist the whole domain.
The thing that really pisses me off is that I just received an email in reply to my request for review, and the whole thing is a gaslighting extravaganza: "Google systems indicate your domain no longer contains harmful links or downloads. Keep yourself safe in the future by..." blah blah blah.
Umm. No! It's actually Google's crappy, non-deterministic, careless detection that flagged my legitimate resources as malicious. Then I have to spend my time running it down and double-checking everything before submitting a request to have the false positive on Google's end fixed.
Convince me that Google won't abuse this to make self hosting unbearable.
> I suspect my flagging was the result of failing to use a large email provider.
This seems like the flagging was a result of the same login page detection that the Immich blog post is referencing? What makes you think it's tied to self-hosted email?
Wonder if there would be any way to redress this in small claims court.
google: we make going to the DMV look delightful by comparison!
They are not the government and should not have this vast monopoly power with no accountability and no customer service.
the government probably shouldn't either?
I think the other very interesting thing in the reddit thread[0] for this is that if you do well-known-domain.yourdomain.tld then you're likely to get whacked by this too. It makes sense I guess. Lots of people are probably clicking gmail.shady.info and getting phished.
0: https://old.reddit.com/r/immich/comments/1oby8fq/immich_is_a...
So we can't use photos or immich or images or pics as a sub-domain, but anything nondescript will be considered obfuscated and malicious. Awesome!
As someone who doesn't like Google and absolutely thinks they need to be broken up, no probably not. Google's algorithms around security are so incompetent and useless that stupidity is far more likely than malice here.
Incompetently or "coincidentally" abusing your monopoly in a way that "happens" to suppress competitors (while whitelisting your own sites) probably won't fly in court. Unless you buy the judge of course.
Intent does not always matter to the law ... and if a C&D is sent, doesn't that imply that intent is subsequently present?
Defamation laws could also apply independently of monopoly laws.
Callous disregard for the wellbeing of others is not stupidity, especially when demonstrated by a company ostensibly full of very intelligent people. This behavior - in particular, implementing an overly eager mechanism for damaging the reputation of other people - is simply malicious.