Can someone in GitHub senior leadership please start paying attention and reprioritise towards actually delivering a product that's at least relatively reliable?
I moved my company over to GH enterprise last year (from AzDO) and I'm considering moving us away to another vendor altogether as a result of the constant partial outages. Things that used to "just work" now are slow in the UI, and GH actions fail to schedule in a reasonable timeframe way more than they ever used to. I enjoy GH copilot as much as the next person, but ultimately I came to GH because I needed a git forge, and I will leave GH if the git forge doesn't work.
I second this. GitHub used to be a fantastic product. Now it barely even works. Even basic functionality like the timeline updating when I push commits is unreliable. The other day I opened a PR diff (not even a particularly large one) and it took fully 15 seconds after the page visually finished loading -- on a $2,000 dev machine -- before any UI elements became clickable. This happened repeatedly.
It is fairly stunning to me that we've come to accept this level of non-functional software as normal.
The trend of "non-functional software" is happening everywhere. See the recent articles about Copilot in Notepad failing to start because you aren't signed in with your Microsoft Account.
Not quite everywhere. There's a common denominator for all of those: Microsoft.
Their business is buying good products and turning them into shit, while wringing every cent they can out of them. Always has been.
They have a grace period of about 2-4 years after acquisition where interference is minimal. Then it ramps up. How long a product can survive once the interference begins largely depends on how good senior leadership at that product company is at resisting the interference. It's a hopeless battle, the best you can do is to lose slowly.
At my first company we used Skype to communicate with each other. Mostly chats and files.
One day our internet cable to the office got cut by someone. Well, we didn't realize that for some time, because Skype just continued to work without Internet. It was like a miracle. It was unique software, there's nothing like it even today.
I think the first thing Microsoft did after they bought Skype was to make an Internet connection mandatory, probably to spy on all chats.
Heh, I was working at 2 of those gaming companies when they were acquired by m$. I almost fear taking another job in the gaming industry, there seems to be some kind of bastardised version of Murphy's law that any gaming company that hires me will be acquired by ms 6 months later.
I mean, that's obviously not the case, but it's weird that it happened twice!
Very weird it happened twice! But that's kind of a cool factoid to tell people haha.
Even with devs and publishers that don't die or get killed, they still lay hundreds off when a game is done. Then the studio limps along in pre-production mode on their next game for 4-5 years it seems like...
Maybe the only job stability in the industry is with indies, and... Nintendo?
I'd add the hugely successful studios to that list. Even after ms acquisition, to the best of my knowledge neither of the 2 studios I worked at had any layoffs.
But they boast the most sold video game in the history of videogames (Tetris a close-ish second), and most downloaded free mobile game, respectively. Each has a player base larger than the population of the country it's from!
> But they boast the most sold video game in the history of videogames (Tetris a close-ish second), and most downloaded free mobile game, respectively.
Just out of curiosity: I guessed Minecraft (which tracks) and Subway Surfers, respectively, rather than Candy Crush Saga. Is CCS actually the most downloaded free mobile game ever?
I for one am shocked--SHOCKED, I say!--to learn that anything bad could happen as a result of a) putting everything in "the cloud" and b) handing control over the entire world's source code to the likes of Microsoft.
Who could have POSSIBLY foreseen any kind of dire consequences?
Nobody. Nobody at all could have seen it. Microsoft is cool now, haven't you seen VSCode? They do Open Source, they run Linux, they've joined the fold, the tiger shed its stripes.
You're obviously being sarcastic, but for the longest time the dominant position of a huge chunk of HN (and the tech world in general) has been that the cloud is the answer to any problem, and that anything deviating from it is either impossible, too expensive or too stupid.
After a generation of indoctrinated people, Microsoft (or any FAANG really) can't even afford to do anything differently.
Sci-Fi Author: In my book I invented the Torment Nexus as a cautionary tale.
Tech Company: At long last, we have created the Torment Nexus from classic sci-fi novel Don't Create The Torment Nexus.
Evil incarnate and the next president of the United States you've never heard of. Vance is his sock puppet, he was chosen because he is guaranteed not to have a single independent thought so when Trump croaks Thiel will be the president in all but name.
It was also he who willed OpenAI into being, in order to help destroy American democracy.
Fascism can only thrive in an alternate reality and LLMs are excellent at producing such propaganda on an industrial scale. Accordingly, the political right uses it for that purpose much more and conservatives are much more receptive to it, too.
This thread has complaints about two products from the same supplier, both degrading.
The person(s) who wanted this want Azure to get bigger and have prioritized Azure over Windows and Office, and their share price has been growing handsomely.
‘Microslop’, perhaps, but their other nickname has a $ in it for a reason.
Let’s just say there are a couple of guys, who are up to no good. And they started making trouble in our neighborhood.
Jokes aside, it's all because of hyper financial engineering. Every dollar, every little cent must be maximized. Every process must be exploited and monetized, and there is a small group of people who are essentially driving all of this across the world in every industry.
It was a complete accident. Nobody could have foreseen it. We are currently experiencing the sudden discovery that Microsoft is an evil corporation and maybe putting everything in the cloud wasn't the best move after all.
Hey from the GitHub team. Outages like this are incredibly painful and we'll share a post-mortem once our investigation is complete.
It stings to have this happen as we're putting a lot of effort specifically into the core product, growing teams like Actions and increasing performance-focused initiatives on key areas like pull requests, where we're already making solid progress[1]. I'd love it if you reached out to me in a DM about the perf issues you mentioned with diffs.
There's a lot of architecture, scaling, and performance work that we're prioritizing as we work to meet the growing code demand.
We're still investigating today's outage and we'll share a write up on our status page, and in our February Availability Report, with details on root cause and steps we're taking to mitigate moving forward.
Literally everyone who has used GitHub to look at a pull request in, say, the last year has experienced the ridiculous performance issues. It's a constant laughing point on HN at this point. There is no way you don't know this. Inviting people to take this to a private channel, along with the rest of your comment really, is simply standard corporate PR.
Yes, agreed, it's been a huge problem, and we shipped changes last week to address some of the gnarly p99 interactions. It doesn't fix everything, and large PRs have a lot of room to be faster. It's still good to know where some of the worst performance issues are, to see if there's anything particularly problematic or if a future change will help.
FWIW, I find the new React-based diff viewer worse than the old server-rendered page. I disabled the preview for this reason. It does have some nice features but overall it feels more finicky. I would think that in theory this should be better at handling large diffs but I'm not sure that that's the case, and at least the UX feels more choppy.
That's financialization at play. When you render and syntax-highlight the diff on the server, GitHub pays the cost; if you do it on the client side, the cost is paid by the client. At GitHub's scale it's probably a large enough difference that they decided the reduced customer experience is worth it.
I have been using GitHub since 2011 and it's undeniable that the performance of the website has been getting worse. The new features that are constantly being added are certainly a factor, but I think the main cause is the switch to client-side rendering, which obviously shifted the load from their servers to our browsers and also tends to produce ridiculously large and inefficient DOMs[1].
If you want a practical example, here you go. I'm a Nixpkgs committer, and every time I make a pull request that backports some change to the stable branch, GitHub unprompted starts comparing my PR against master. If I'm not fast enough to switch the target branch within a couple of seconds, it literally freezes the browser tab and I may have to force quit it. Yes, the diff is large, but this is not acceptable, and more importantly, it didn't happen a few years ago.
It's insulting to see the word "progress" being used when the PR experience is orders of magnitude slower than it was years ago, when everyone had way worse computers. I have a maxed M5 MacBook and sometimes I can barely review some PRs.
Hopefully the published postmortem will announce that all features will be frozen for the foreseeable future and every last employee will be focused on reliability and uptime?
I don’t think GitHub cares about reliability if it does anything less than that.
I know people have other problems with Google, but they do actually have incredibly high uptime. That kind of feature freeze was frequently applied to entire orgs or divisions of the company if they had one outage too many.
For what it's worth, I doubt that people think it's the engineering teams that are the problem; it feels as though leadership just doesn't give a crap about it, because, after all, if you have a captive audience you can do whatever you want.
(See also: Windows, Internet Explorer, ActiveX, etc. for how that turned out)
It's great that you're working on improving the product, but the (maybe cynical) view that I've heard more than anything is that when faced with the choice of improving the core product that everyone wants and needs or adding functionality to the core product that no one wants or needs and which is actively making the product worse (e.g. PR slop), management is too focused on the latter.
What GitHub needs is a leader who is willing and able to say no to the forces enshittifying the product with crap like Copilot, but GitHub has become a subsidiary of Copilot instead and that doesn't bode well.
> I doubt that people think it's the engineering teams that are the problem
Did you forget Microsoft engineering's response to Casey Muratori's "Extremely slow performance when processing virtual terminal sequences" issue?
"I believe what you’re doing is describing something that might be considered an entire doctoral research project in performant terminal emulation as “extremely simple” somewhat combatively."
Ya, it really was one of the most enjoyable web apps to use pre-MS. I'm sure there are lots of things that have contributed to this downfall. We certainly didn't need bullshit features like achievements.
Even just a year or two ago its web interface was way snappier. Now an issue with a non-trivial number of comments, or a PR with a diff of even just a few hundred or thousand lines of changes causes my browser to lock up.
This is just Microsoft doing the only thing they know, which is taking a good product and turning it into a monster by bashing out whatever feature is on some investor's mind, features that barely even work in an isolated, vacuum-sealed test chamber. All Microsoft products are like bad experiments.
So the React rewrite did not help after all? Imagine, one of the largest software tool companies on Earth cannot reliably REbuild something in React. I've lost count of the inconsistency issues React introduced.
The new design/architecture allows them to do great stuff in the name of efficiency; for example, when browsing through some parts of the UI, it's now much more capable of just updating the part of the page that's changed, rather than having to reload the entire thing. This is a significantly better approach for a lot of things.
I understand that the 'updating the part of the page that's changed' functionality is now dramatically slower, more unresponsive, and less reliable than the 'reload the entire thing' approach was, and it feels like browsing the site via Citrix over dial-up half the time, but look, sacrifices have to be made in the name of making things better even if the sacrifice is that things get worse instead.
> to do great stuff in the name of efficiency; for example, when browsing through some parts of the UI, it's now much more capable of just updating the part of the page that's changed
Are you implying that this is only doable with React? I mean, just for the fun of it, you can look at this video:
GitHub used jQuery + pjax to do exactly this a decade ago - rendered HTML for smaller components was fetched and replaced in-place with a single DOM update. It even had fancy sliding transitions.
> for example, when browsing through some parts of the UI
React allows this? I didn't realize that I needed React to do this when we used Java and Js to do this 20 years ago. I also didn't realize I needed React to do this when we used Scala and generated Js to do this 10 years ago. JFC, the world didn't start when you turned 18.
> I understand that the 'updating the part of the page that's changed' functionality is now dramatically slower, more unresponsive, and less reliable than the 'reload the entire thing' approach was, and it feels like browsing the site via Citrix over dial-up half the time, but look, sacrifices have to be made in the name of making things better even if the sacrifice is that things get worse instead.
Was this a local/on prem version of GL or the hosted web version?
My previous org had an on-prem version hosted on a local VM. It was extremely fast; we set up another VM for the runners, and one for storing all the Docker containers. The thing I've seen people do is use the VM they put their GitLab instance on for everything, which ends up bogging things down quite a bit.
We loved GitHub as a product back before it needed to return a profit beyond "getting more users".
I feel this is just the natural trajectory for any VC-funded "service" that isn't actually profitable at the time you adopt it. Of course it's going to change for the worse to become profitable.
Moving to client-side rendering via React means less server load spent generating boilerplate HTML over and over again.
If you have a captive audience, you can get away with making the product shittier because it's so difficult for anyone to move away from it - both from an engineering standpoint and from network effects.
It seems most of the complaints are about the reliability and infrastructure - which is very much often a direct result of lack of investment and development resources.
And then many UI changes people have been complaining about are related to things like Copilot being forcibly integrated - which is very much in the "Microsoft expects to gain a profit by encouraging its use" camp.
It's pretty rare that companies make a bad UI because they want a bad UI; it's normally a second-order effect of other priorities - such as promoting other services or encouraging more ad impressions or similar.
I mean.. it’s a Microsoft product now. That’s basically a guarantee it will suck, and continue to get worse and worse until it’s an unusable mess of garbage like everything else they make. I haven’t seen any good user-facing windows products in at least 10 years and somehow the bar drops lower by the year.
In testing for my workflows copilot significantly underperforms the SOTA agents, even when using the exact same models. It's not particularly close either.
This has led to 2 classes of devs at my company: a) the AI-hesitant, for many of whom Copilot is their only interaction with AI, having their worst fears confirmed about how bad it is; b) AI enthusiasts who are irritated at having to deal with management that doesn't know the difference and pushes back on their asks for access to SOTA agents.
If I were the frontier labs, and wasn't billions of dollars beholden to Microsoft, I'd cut Copilot off. It poisons the well for adoption of their other systems. I don't deal with the other copilots besides the coding agent variants but I hear similar things about the business application variants.
Microsoft's AI reputation is in the toilet right now; I'm not sure it's understood how bad it really is within the org.
Interesting - these head to head comparisons you’re doing with the same model - what harnesses are you comparing, say Claude code / codex versus copilot cli?
> I'm not sure if its understood how bad it really is within the org.
I can’t speak to that, but there’s a lively culture of people using internal tooling who also extensively use 3p products on projects outside work and are in a reasonable position to assess how well GH copilot works.
Yeah, I’m only interested in cli and non-interactive agent usage. I don’t compare say the vs code plugins, but do regularly compare say GitHub code reviews.
Those comparisons for instance have made us turn _off_ copilot pull requests entirely. All of the agents have false positives (as do humans) but copilot was having negative value in that context.
I've only started using it, so maybe I'm holding it wrong, but the other day I asked the IntelliJ plugin to explain two lines of code by referencing the line numbers. It printed & explained two entirely different lines in a different part of the file. I asked again. It picked two lines somewhere else.
After using ChatGPT for the last 6 months or so, Copilot feels like a significant downgrade. On the other hand, it did easily diagnose a build failure I was having, so it’s not useless, just not as helpful.
Sure, I love Claude Code too - I use it plenty outside of work. But funnily enough I've been asking myself whether to get my org on board with internal Claude Code trials and was struggling to truly articulate what we were losing versus the Copilot CLI. There are some feature gaps - but the pace of work is great and the experience is pretty good for me.
No one should hit Microsoft over the head for giving people access to Claude code - choice and competition is good!
GitHub used to publish some pretty interesting postmortems. Maybe they still do. IIRC they were struggling with scaling their SQL db and were starting to hit the limits. It's a tough position to be in, because you either have to do a massive migration to a data layer with much different semantics, or you have to keep desperately squeezing performance and skirting the edge of outages with a DB that wasn't really meant to handle what you're doing with it now.
The OpenAI blog post on "scaling" Postgres to their current scale has much the same flavor, although I think they're doing it better than Github appears to be doing.
I’d be surprised by this: GitHub pretty famously used Vitess, and I’d be surprised if each shard were too big for modern hardware. Based on previous reporting [0], they’re running out of space in the main data center and new management is determined to move to Azure in a hurry. I’d bet that these outages are a combination of a worsening capacity crunch in the old data center and…well, Azure.
> Can someone in GitHub senior leadership please start paying attention and reprioritise towards actually delivering a product that's at least relatively reliable?
It's Microsoft. A reliable product is not a reasonable expectation.
The ultimate irony is that Linus Torvalds designed git with the Linux kernel codebase in mind to work without any form of infrastructure centralisation. No repo trumps any other.
Surely some of your crazy kids can rummage up a CI pipeline on their laptop? 8)
Anyway, I only use GH as something to sync interesting stuff from, so it doesn't get lost.
Not going to happen. This is terminal decline. Next step is to kill off free repos, and then they'll start ratcheting up the price to the point that they have one small dedicated engineering team supporting each customer they have. They will have exactly one customer. At some point they'll end up owned by Broadcom, OpenText, Rocket, or Progress.
Killing off free repos is not going to happen. That would be a suicide move on the level of the Digg redesign, or Tumblr's porn ban.
It kind of would be good for everyone if they did do it though. Need to get rid of this monopoly, and maybe people will discover that there are alternatives with actually good workflows out there.
My favourite restriction is the fact that colored text doesn't work in dark mode. Why? Because whatever intern they had implement dark mode didn't understand how CSS works, and just slapped !important on all the style changes that make dark mode dark, and thus overwrite the color data.
I ended up writing a browser extension for my team to fix it, because the boss loved to indicate stuff with red/green text.
When I last had the misfortune of using DevOps, copying and pasting text from a work item would take the background colour with it, though not the text colour.
Colleagues in dark mode would rearrange some sentences, and I'd be left with black text on an almost-black background.
That's going to depend on each user's demands. The PR message limit is the biggest pain for me. I don't depend on the UI very often. I'm not trying to do any CI/CD nonsense. I just use it as a bog standard git repo. When used as that, it works just fine for me
So I work for a devtools vendor (Snyk) and 6 months ago I signed into Azure DevOps for the first time in my life
I couldn't believe it. I actually thought the product was broken. Just from a visual perspective it looked like a student project. And then I got to _using_ the damn thing
It's also completely unloved. Even MSFT Azure's own documentation regularly treats it as a second class citizen to GitHub. I have no idea why they don't just deprecate the service and officially feature freeze it.
Honestly that's the case with a lot of Azure services though.
Someone mentioned the boards, but Pipelines/Actions are not 100% compatible.
My company uses Azure DevOps for a few things and any attempt to convert to GitHub was quickly abandoned after we spent 3 hours trying to get some Action working.
However, all usability quirks aside, I actually prefer it these days since Microsoft doesn't really touch it and it just sits in a corner doing what I need.
> Can someone in GitHub senior leadership please start paying attention and reprioritise towards actually delivering a product that's at least relatively reliable?
They claim that is what they are doing right now. [1]
Zero indication that migrating to azure will improve stability over the colos they are in now. The outages aren’t caused by the datacenter, whatever MS execs say.
You might as well self-host at this point as that is far more reliable than depending on GitHub.
Additionally, there is no CEO of GitHub this time that is going to save us here.
So, as I said many years ago [0], in the long term a better way is to self-host or use alternatives such as Codeberg or GitLab, which you can at least self-host yourself.
What's interesting about GitHub outages is how they've become a forcing function for teams to re-examine their deployment pipeline resilience.
We've gotten so used to GitHub being "always there" that many teams have zero fallback. CI/CD stops. Deploys halt. Hotfixes can't ship. During an active incident on your own systems, that's brutal.
A few things I've seen teams do after getting burned:
1. Mirror critical repos to a secondary git host (GitLab, self-hosted Gitea) - a sketch follows this list
2. Cache dependencies aggressively so builds don't fail on external fetches
3. Have a manual deploy runbook that doesn't require GitHub Actions
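For point 1, a minimal sketch of what a mirror job can look like, assuming a hypothetical self-hosted remote (the host and repo names below are made up):

    # one-time setup: bare mirror clone plus a second remote
    git clone --mirror git@github.com:acme/app.git
    cd app.git
    git remote add backup git@git.internal.example:acme/app.git
    git push --mirror backup
    # run on a schedule (cron/systemd timer) to keep the mirror current
    git fetch --prune origin
    git push --mirror backup

The same loop works against GitLab, Gitea or Forgejo, since it only relies on plain git.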
The status page being hours behind reality is a separate frustration. I've started treating official status pages as "eventually consistent" at best — by the time they update, Twitter/X and internal monitoring have usually told me what I need to know.
GitLab is the solution, if you aren't on it already.
I worked for one of Australia's largest airline companies; the monthly meetings with the GitHub team could be summed up in one word: AI.
There is zero focus on the actual platform as we knew it; it is all AI, Copilot, more AI and more Copilot.
If you are expecting things to get better, I have bad news for you.
Copilot is not being adopted by companies as they hoped; they are using Claude themselves.
If Microsoft ever rolls back, boy oh boy, things will get ugly.
GitLab is no improvement over github, their features are frequently half-baked, their site is slow, and outages are just as common.
I used to like GitLab, and I've self-hosted enterprise versions of both GitHub and GitLab, and strongly believe migrating from one of them to the other for "improved reliability" will be utterly underwhelming and pointless.
GitLab used to be able to take the high ground due to the open-core model, but these days I'm not even sure that makes an appreciable difference.
To the best of my knowledge, Copilot is a Microsoft in-house thing and it sucks at everything.
Claude is far superior and Microsoft is allegedly using Claude internally over its own AI solution.
Github Copilot is just a front-end. You pay for the frontend and some premium requests every month.
The base models like GPT-4o and 4.1 don't have a usage cap. Models like Claude Sonnet, Opus, etc. have a monthly limit, and you can pay more to use these through GitHub Copilot.
Yes, Copilot is often used to refer to both frontend and backend.
Also, the chatbot, aka Copilot, must have features of its own.
Like when I say to Roo Code/Cline in VS Code: write a Python script to output hello world.
And they will do exactly that; Roo Code is pretty impressive.
Copilot on its own is dogshit and when using it the same way I would use Perplexity AI or ChatGPT, it is twice as much dogshit.
The problem is that each feature has been slightly more half-baked than the last one. The SecOps stuff is full of gotchas which don't exist elsewhere. Troubleshooting a pipeline that isn't behaving correctly is extremely painful.
The other problem is that if you want a feature you have to upgrade the seat license for everyone :(
End users are being screwed over left and right; you'd better host your own code.
GitHub and GitLab only add a GUI on top of git.
Enterprises running the self-hosted Helm deployment will pay if that means no interruptions and no AI being pushed everywhere. Some companies adopt GitLab precisely because you can self-host it; even the runners are self-hosted, as there is no built-in runner like GitHub's.
I wonder if GitHub is feeling the crush of fully automated development workflows? Must be a crazy number of commits now to personal repos that will never convert to paid orgs.
IME this all started after MSFT acquired GitHub but well before vibe coding took the world by storm.
ETA: Tangentially, private repos became free under Microsoft ownership in 2019. If they hadn't done that, they could've extracted $4 per month from every vibe coder forever(!)
An anecdote: On one project, I use a skill + custom cli to assist getting PRs through a sometimes long and winding CI process. `/babysit-pr`
This includes regularly polling CI checks using `gh`. My skill / cli is broken right now:
`gh pr checks 8174 --repo [repo] 2>&1`
Error: Exit code 1
Non-200 OK status code: 429 Too Many Requests
Body:
{
"message": "This endpoint is temporarily being throttled. Please try again later. For more on scraping GitHub and how it may affect your rights, please review our Terms of Service (https://docs.github.com/en/site-policy/github-terms/github-terms-of-service)",
"documentation_url": "https://docs.github.com/graphql/using-the-rest-api/rate-limits-for-the-rest-api",
"status": "429"
}
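A crude mitigation is to back off and retry only when the response is a throttle; a rough sketch reusing the same command (the `[repo]` placeholder is left as in the example above):

    for delay in 15 30 60 120; do
      out=$(gh pr checks 8174 --repo [repo] 2>&1) && { echo "$out"; break; }
      if echo "$out" | grep -q "429"; then
        echo "throttled, retrying in ${delay}s..." >&2
        sleep "$delay"
      else
        echo "$out" >&2   # a real failure (e.g. failing checks); don't hammer the API
        break
      fi
    done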
I simply do not believe that all of these people can and want to set up CI. Some maybe, but even if the agent recommends it, only a fraction of people would actually do it. Why would they?
But if you set up CI, you can pull up the mobile site on your phone, chat with Copilot about a feature, then ask it to open a PR, let CI run, iterate a couple of times, then merge the PR.
All the while you're playing Wordle and reading the news on the morning commute.
It's actually a good workflow for silly throw away stuff.
No, it's not. 121M repos were added on GitHub in 2025, and overall they have 630 million now. There is probably at best a 2x increase in output (mostly trash output), but nowhere near 100x.
> It may not be 100x as was told to me but it's definitely putting the strain on the entire org.
But that's not even in the top 5 strains on GitHub; their main issue is the forced adoption of Azure. I can guarantee you that about 99% of repos are still cold, as in very few pulls and no pushes, and that hasn't changed in 3 months. Storage itself doesn't add that much strain on the system if the data is accessed rarely.
I put the blame squarely on GitHub and refuse to believe it's a vendor's fault. It's their fault. They may be forced to use Azure, but that doesn't stop them from being able to deliver a service.
I’ve done platforms on AWS, Azure, and GCP. The blame is not on the cloud provider unless everyone is down.
I was wondering about that the other day, the sheer amount of code, repos, and commits being generated now with AI. And probably more large datasets as well.
I still say that mixing CI/CD with code/version control hosting is a mistake.
At its absolute best, everything just works silently, and you now have vendor lock-in with whichever proprietary system you chose.
Switching git hosting providers should be as easy as changing your remotes and pushing. Though nowadays that requires finding solutions for the MR/PR process, and the wiki, and all the extra things your team might have grown to rely on. As always, the bundle is a trap.
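The plain-git part really is just a remote change; a minimal sketch with a hypothetical new host (nothing else - PRs/MRs, wiki, CI config - comes along for the ride):

    # point the existing clone at the new host and push everything
    git remote set-url origin git@git.newhost.example:acme/app.git
    git push --all origin     # branches
    git push --tags origin    # tags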
I mean, not necessarily proprietary right? There are OSS solutions like forgejo that make it pretty simple, at least as simple as running a git system and a standalone CI system
I mean, that is certainly better, but I still don't like having them coupled. Webhooks were a great idea, and everyone seems to have forgotten about them.
At this point, GitHub outages feel closer to cloud provider outages than a SaaS blip. Curious how many people here still run self-hosted Git (GitLab / Gitea) vs fully outsourcing version control.
My previous two startups used GitLab successfully. The smaller startup used paid-tier hosted by gitlab.com. The bigger startup (with strategic cutting-edge IP, and multinational security sensitivity) used the expensive on-prem enterprise GitLab.
(The latter startup, I spent some principal engineer political capital to move us to GitLab, after our software team was crippled by the Microsoft Azure-branded thing that non-software people had purchased by default. It helped that GitLab had a testimonial from Nvidia, since we were also in the AI hardware space.)
If you prefer to use fully open source, or have $0 budget, there's also Forgejo (forked from Gitea). I'm using it for my current one-person side-startup, and it's mostly as good as GitLab for Git, issues, boards, and wiki. The "scoped" issue labels, which I use heavily, are standard in Forgejo, but paid-tier in GitLab. I haven't yet exercised the CI features.
I just checked out Forgejo. I think I'll start with it; it looks clean and lightweight. For my homelab I don't have very large requirements. Might be a good starting point for me.
I was just looking into this today but it seems pricey. $29/user/month for basic features like codeowners and defining PR approval requirements. Going with Forgejo.
Wait, what? So you're on the hook for backups, upgrades, etc. and you have to pay them for the privilege? I thought GitLab was free as in speech and beer.
I consider moving away from Github, but I need a solid CI solution, and ideally a container registry as well. Would totally pay for a solution that just works. Any good recommendations?
We can run a Forgejo instance for you with Firecracker VM runners on bare metal. We can also support it and provide an SLA. We're running it internally and it is very solid. We're running the runners on bare metal, with a whole lot of large CI/CD jobs (mostly Rust compilation).
The downside is that the starting price is kinda high, so the math probably only works out if you also have a number of other workloads to run on the same cluster. Or if you need to run a really huge Forgejo server!
I suspect my comment history will provide the best details and overview of what we do. We'll be offering the Firecracker runner back to the Forgejo community very soon in any case.
I actually went through some of the issue/pr stuff for the forgejo project after I asked you. It seems like things are moving along nicely and you seem to have found a welcoming environment in their repo. I will keep an eye on that progress. Thanks very much. I do not have a pressing need but firecracker runners would be pretty awesome to have.
Long time GitLab fan myself. The platform itself is quite solid, and GitLab CI is extremely straightforward but allows for a lot of complexity if you need it. They have registries as well, though admittedly the permission stuff around them is a bit wonky. But it definitely works and integrates nicely when you use everything all in one!
Should our repos be responsible for CI in the first place? Seems like we keep losing the idea of simple tools to do specific jobs well (unix-like) and keep growing tools to be larger while attempting to do more things much less well (microsoft-like).
I think most large platforms eventually split the tools out because you indeed can get MUCH better CI/CD, ticket management, documentation, etc from dedicated platforms for each. However when you're just starting out the cognitive overhead and cost of signing up and connecting multiple services is a lot higher than using all the tools bundled (initially for free) with your repo.
It would be interesting to have a graph showing AI adoption in coding against the number of weekly outages across different companies. I am sure they are quite correlated.
> It would be interesting to have a graph showing AI adoption in coding against the number of weekly outages across different companies. I am sure they are quite correlated.
Probably a stronger correlation to the fact that vibe-coding has resulted in millions of new repos being created, with automatic CIs being triggered by agents continuously sending PRs for those projects.
Exactly! Also, operating "at scale" is only impressive if you can do it with comparable speed and uptime; it doesn't mean much if every page takes seconds to load and it falls over multiple times a day lol
I'm starting to wonder if people doing what were previously unconventional workflows (which may not be performance optimized) are affecting things.
For example, today, I had Claude basically prune all merged branches from a repo that's had 8 years of commits in it. It found and deleted 420 branches that were merged but not deleted.
Deleting 420 branches at once is probably the kind of long tail workflow that was not worth optimizing in the past, right? But I'm sure devs are doing this sort of housekeeping often now, whereas in the past, we just never would've made the time to do so.
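For reference, the same housekeeping is a short plain-git job; this sketch assumes the default branch is "main" and the remote is "origin", and batches the deletes so it's not 420 individual pushes:

    # delete remote branches already merged into the default branch
    git fetch --prune origin
    git branch -r --merged origin/main \
      | grep -vE 'origin/(main|HEAD)' \
      | sed 's|origin/||' \
      | xargs -r -n 50 git push origin --delete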
This is exactly why my employer is unlikely to adopt Azure. When CoreAI assets like GitHub appear poorly managed, it undermines confidence in the rest of the ecosystem. It’s unfortunate, because Microsoft seems to overlook how strongly consumer experience shapes business perception. Once trust is damaged, no amount of advertising spend can fully restore it.
They don't care. Their sales reps absolutely know that if you are using Microsoft products it is because you are locked in so deeply that escape is nearly impossible.
You would reach the same conclusion by trying to use it for this purpose, or indeed for any purpose at all. Incidents that make you unable to deploy, rendering all your CD efforts pointless, are only the cherry on top.
I moved everything on GitHub to a self-hosted Forgejo instance some days ago. I really did not do anything myself: I created some tokens so that CC could access GitHub, Forgejo, and my DNS API. Self-hosting is so much simpler and easier with AI. Expect more people to self-host small to medium stuff.
Ironic that the same AI you're mentioning is probably a large part of why this class of outages is increasing. I'd highly recommend folks understand their infrastructure well enough to set up and run it without AI before they put anything critical on it.
Sure. I can agree with that. At the same time, the reason people aren't doing it is not solely a skill issue. It's also a matter of time, energy, and what you want to prioritise.
I believe I have good enough control over it to fix issues that may arise. But then again, CC will probably do it faster. I will most likely not need to fix my own issues, but if needed, I think I will be able to.
"Critical" plays an important role in what you're saying. The true core of any business is something you should have good control over. You should also accept that less important parts are OK for AI to handle.
I think the non-critical part is a larger part than most people think.
We are lagging behind in understanding what AI can handle for us.
I'm an optimistic grey beard, even if the writing makes me sound like a naive youth :)
My company just migrated to GitHub, and it's been a shockingly bad experience. BitBucket never felt like anything more than a tool that did the job, but now I really miss it.
Someone needs to make an MCP server for my Claude so it can check if services are down; it goes stir-crazy when GitHub is down and adds heaps of workaround code =D
Remember the other day when a bunch of yous were making fun of Zig moving away from GitHub?
Now suddenly you all say this is not the future you wanted.
Every day you opt in to get wrecked by Microsoft.
You all do realize you all could, for a change, learn something and never again touch anything Microsoft related?
> You all do realize you all could, for a change, learn something and never again touch anything Microsoft related?
I learned that lesson in the 90s and became an "ABM" (Anything But Microsoft).
People sadly shall never learn: Windows 12 is going to come out and shall suck more than any previous version of Windows except Windows 11, so they'll see it as progress. Then Windows 13 is going to be an abysmal piece of crap and people shall hang on to their Windows 12, wondering how it's possible that Microsoft came out with a bad OS.
There are still people explaining, today, that Microsoft ain't all bad because Windows XP was good (for some definition of good). Windows XP came out in late 2001.
Windows XP was a disaster. Resistance to adopting it was industry wide, and lasted years. It was only with Service Pack 2 (which was an enormous rewrite of basically the whole system) that they started to turn the ship around.
This is the predictable outcome of subordinating the GitHub product to the overarching "AI must be part of everything whether it makes sense or not" mandate coming down from the top. It was only a year ago that GitHub was moved under the "CoreAI" group at Microsoft, and there have been plenty of stories of massive cost-cutting and of teams being forced to focus on AI workflows instead of their actual product priorities. To the extent they are drinking their own Kool-Aid, this sort of ops failure is also an entirely predictable outcome of too much reliance on LLM-generated code and workflows rather than human expertise, something we see happening at an alarming scale in a number of public MS repos.
Hopefully it will get bad enough fast enough that they'll recognize they need to drastically change how they are operating. But I fear we're just witnessing a slow slide into complacency and settling for being a substandard product with monopoly-power name recognition.
The irony of githubstatus.com itself being hosted on a third-party (Atlassian Statuspage) is not lost on anyone who works in incident management. Your status page being up while your product is down is table stakes, not a feature.
What's more interesting to me is the pattern: second major outage in the same day, and the status page showed "All Systems Operational" for a good chunk of the first one. The gap between when users notice something is broken and when the status page reflects it keeps growing. That's a monitoring and alerting problem, not just an infrastructure one.
At some point the conversation needs to shift from "GitHub is down again" to "why are so many engineering orgs single-threaded on a platform they don't control and can't observe independently?" Git is distributed by design. Our dependency on a centralized UI layer around it is a choice we keep making.
> The irony of githubstatus.com itself being hosted on a third-party (Atlassian Statuspage) is not lost on anyone who works in incident management. Your status page being up while your product is down is table stakes, not a feature
That's WHY it's hosted externally, so that if GitHub goes down the status page doesn't.