
· One min read
Anthony Scaffidi

After building our own MetalLB, CNPG and Prometheus operator charts, we've now also finished the work on our own Cert-Manager operator chart. As of today, this chart is a requirement for new users who want to use Cert-Manager, and it will be required for all users starting August 1, 2023.

If you have already installed clusterissuer, follow the guidance below to install the Cert-Manager operator chart.

If you have not already done so, add the operator train to TrueCharts as outlined here.

  1. Run this in the system shell as root:
    k3s kubectl delete --grace-period 30 --v=4 -k https://github.com/truecharts/manifests/delete4
  2. Install cert-manager from the operators train.
  3. Update clusterissuer to the latest version (2.0.1+).
  • If you are already on the latest version, perform an empty edit of clusterissuer (edit the app and save without making any changes).

If you run into additional issues, please file a ticket with our dedicated support staff via the #support channel of our Discord as normal.

· 2 min read
Anthony Scaffidi

In keeping with our promise not to introduce breaking changes to the charts within our Enterprise train, we've ensured that both the new and old ways of dealing with "operators" remained supported.

Starting August 1, 2023, we will completely drop support for the old (pre-July installs only, internal and not user-controlled) way of handling operators.

After August 1, 2023, additional checks for operators will be enabled, preventing users from making the mistake of installing charts without the right operator from the operator train present. This means that charts will refuse to update when you're still using the old operators at that time.

If you have already installed the metallb, prometheus-operator, and cloudnative-pg operators, then no further action is required. A quick way to verify this is sketched below.
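
As a minimal sketch, assuming the operator charts were installed under their default names (the exact namespaces and deployment names may differ on your install), you can check for the operator deployments from the system shell:

k3s kubectl get deployments -A | grep -E 'metallb|prometheus-operator|cloudnative-pg'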

Prerequisites

Add the operator train to TrueCharts as outlined here.

MetalLB

The MetalLB operator is only required for users of MetalLB; anyone who does not use or plan to use MetalLB can skip this section.

  1. Uninstall the current metallb chart from the Enterprise train.
  2. Run this in the system shell as root: k3s kubectl delete --grace-period 30 --v=4 -k https://github.com/truecharts/manifests/delete
  3. Complete the MetalLB installation as outlined here.

Prometheus

The Prometheus operator is required for the use of app metrics. Its installation is recommended.

  1. Run this in the system shell as root: k3s kubectl delete --grace-period 30 --v=4 -k https://github.com/truecharts/manifests/delete3
  2. Install prometheus-operator from the operators train.

CNPG

The cloudnative-pg operator is required for any applications that utilize PostgreSQL. Its installation is recommended.

  1. Follow the CNPG Operator Migration Guide to migrate to the new CNPG operator. Ensure you follow the guide carefully, as data loss can occur during this migration if the proper steps are not followed.

If you run into additional issues, please file a ticket with our dedicated support staff via the #support channel of our Discord as normal.

· 2 min read
Kjeld Schouten-Lebbing

After building our own MetalLB operator chart, we've now also finished the work on our own CloudNative-PG chart. As of today, this chart is a requirement for new users who want to run applications featuring a PostgreSQL database.

Updating to the new CloudNative-PG Helm chart for existing users

We do want to point out that existing users should update to the new CloudNative-PG Helm chart as soon as possible. To update an existing install with applications using PostgreSQL databases to the new system, the following procedure can be used:

We want to explicitly highlight that this procedure will COMPLETELY DESTROY all your databases. It's absolutely crucial to export your databases manually beforehand.

  • Export all your databases manually, on SCALE using the following guide: https://truecharts.org/manual/SCALE/guides/cnpg-migration-guide (do not rely on heavyscript backups for this!). A hedged export/restore sketch for Helm users follows after this list.
  • Run this in a root shell: k3s kubectl delete --grace-period 30 --v=4 -k https://github.com/truecharts/manifests/delete2
  • Install the new cloudnative-pg chart from the operators train.
  • Wait a few minutes.
  • Hit edit and save without changes on all applications using PostgreSQL databases.
  • Wait a few minutes.
  • Restore all your databases manually, on SCALE using the following guide: https://truecharts.org/manual/SCALE/guides/cnpg-migration-guide (do not rely on heavyscript backups for this!).
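
For users running our charts on plain Helm, the manual export and restore steps above roughly boil down to a pg_dump/pg_restore round trip against the CNPG pod. The namespace, pod and database names below (ix-myapp, myapp-cnpg-main-1, myapp) are placeholders, not the names your install will actually use; on SCALE, follow the linked guide instead:

k3s kubectl exec -n ix-myapp myapp-cnpg-main-1 -- pg_dump -Fc -d myapp > myapp.dump
# ...after reinstalling the operator and re-saving the app:
k3s kubectl exec -i -n ix-myapp myapp-cnpg-main-1 -- pg_restore -d myapp --clean < myapp.dump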

If you run into additional issues, please file a ticket with our dedicated support staff via the #support channel of our Discord as normal.

· 3 min read
Kjeld Schouten-Lebbing

Introduction: Our own Operator Charts

Over the last few months, we've experimented with injecting so-called "operators" into the cluster directly when using our charts. Manifests for things like MetalLB, Cert-Manager and CNPG were always loaded. While this system guaranteed users were always running the latest operator versions, we've also encountered some downsides. Primarily:

  • Loading manifests from the web is a security issue
  • Loading manifests required a pre-install job with full-cluster permissions, which is also a security issue
  • Mistakes in the manifests directly affect all users, regardless of version
  • It requires creating namespaces outside of the ix- prefixed style; while not an issue as such, iX developers have voiced annoyance with it
  • It lacks any configurability for users that need customisation
  • It prevents users from using these operators outside of the TrueCharts scope on non-SCALE systems

Fixing all of these issues was quite the challenge. First off, we needed to figure out a way to prevent users from installing multiple instances of the same operator. But we also needed to ensure that users always have the correct operators installed for the charts they want to install.

By now we've designed industry-leading Helm logic that scans your cluster for references to installed operators and compares those against the operators a chart requires.
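
As a rough sketch of how such a check can work (this is not our actual implementation), a Helm template can use the built-in lookup function to query the live cluster for an operator's CRD and abort the install when it is missing; the CRD name below is just one example:

{{- /* Fail the install when the CNPG operator's CRD is not present in the cluster */ -}}
{{- if not (lookup "apiextensions.k8s.io/v1" "CustomResourceDefinition" "" "clusters.postgresql.cnpg.io") }}
  {{- fail "Required operator not detected: please install the cloudnative-pg chart from the operators train first." }}
{{- end }}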

Besides this logic, we also need to write the Helm charts ourselves. This is a lot of work, as operators are notoriously complex to write Helm charts for. Luckily we have enough experienced Kubernetes developers that we're certain to pull this off!

First chart: MetalLB

As a first example of our new logic, we're super happy to introduce our first self-built operator Helm chart: MetalLB. It is completely self-contained within its own namespace, does not load dynamic manifests from the web and doesn't contain risky security practices.

Obviously this chart, in the operators train, has a naming conflict with the old metallb chart in the enterprise train, so the latter has been renamed to metallb-config, requiring a reinstall. We want to point out that only the new metallb-config chart is compatible with the new self-built metallb operator.

We are also very happy to announce that the metallb-config chart is fully compatible with both our old and new ways of installing/managing metallb. However, new installs of the old way of handling metallb (without the chart from the operators train) will be actively disabled from now on.

To use MetalLB on new installs, one needs to install both metallb and metallb-config, in that order.

Updating to the new MetalLB helm chart

We do want to point out that users should update to the new MetalLB Helm chart as soon as possible. To update a current install using MetalLB to the new system, the following procedure can be used:

  • Remove the old metallb chart coming from the enterprise train.
  • Run this in a root shell: k3s kubectl delete --grace-period 30 --v=4 -k https://github.com/truecharts/manifests/delete
  • Install the new metallb chart from the operators train.
  • Wait a few minutes.
  • Install or update metallb-config to the latest version.
  • Wait a few minutes.
  • Hit edit on metallb-config and save without changes if you were already on the latest version or it isn't working yet.
  • Wait a few minutes.

If you run into additional issues, please file a ticket with our dedicated support staff via the #support channel of our Discord as normal.

· 2 min read
Kjeld Schouten-Lebbing

BLUF: Traefik (Stable) is Deprecated. Users need to add the Enterprise channel and install Traefik (Enterprise). https://truecharts.org/manual/SCALE/guides/getting-started#adding-truecharts

The use of TrueNAS SCALE certificates is also deprecated and you must migrate to Clusterissuer (Enterprise). https://truecharts.org/charts/enterprise/clusterissuer/how-to (note: Clusterissuer replaced Cert-Manager)

As some of you might've noticed, Traefik has been a bit outdated the last few weeks. The reason behind this is that a multitude of potentially breaking to-dos were left, and we don't want to bother users with continuous manual intervention on breaking changes. By now we've fixed the remaining issues and will soon release a breaking-change release for Traefik and a patch for all the charts.

In short, we've ensured that we only use our signature "tc-system" namespace for storing configuration and middlewares for Traefik. This ensures consistent behavior for users using ingressClasses and allowed us to cleanly fix the known bug where a port got appended to the TrueNAS SCALE "portal" button. A hedged example of what such a shared middleware looks like is sketched below.
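
For illustration only (the names here are placeholders, not the middlewares our charts actually ship): a Traefik Middleware defined once in the shared tc-system namespace, which routes can then reference as tc-system-redirect-https@kubernetescrd.

# Hypothetical shared middleware living in the tc-system namespace
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: redirect-https
  namespace: tc-system
spec:
  redirectScheme:
    scheme: https
    permanent: true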

This also means that charts that do not get patches because they have not been ported to the new common, most notably Nextcloud, will inherently not work anymore. Though users would've been ill-advised to use it at all currently, due to the big ongoing Nextcloud rework.

Unrelated new issue

We also got a request from iXsystems staff a while ago to limit our use of non-ix-prefixed namespaces on Kubernetes. While the rest of that effort requires a lot of work building our own operator Helm charts, these Traefik changes are the initial step to comply with those wishes: the "low-hanging fruit".

We're working hard on building separate operator Helm charts instead of handling this in the background. This work leads to an unrelated, temporary issue which has been created on purpose: CNPG will currently only be installed on new systems if one of our "enterprise" charts is being installed. More news about this will be released later.

For any help, please file a ticket with our dedicated support staff via the #support channel of our Discord as normal.

· 4 min read
Kjeld Schouten-Lebbing

Previously we've warned users against using the stop button on TrueNAS SCALE. At the same time, we also understand that users expect platform uniformity between Helm and SCALE. That's why we're happy to announce our own solution to stop our Charts: TrueCharts Stop-All!

About that stop button

First off, we would like to go into a bit more depth about the design issues of the TrueNAS SCALE "Stop" Button. We've hinted at it previously, but it's always good to explain why we need to step in ourselves.

The idea with Kubernetes is that one tool should be managing the deployment of objects at a time, often indicated by a managed-by annotation on said objects. With TrueNAS SCALE, the middleware triggers a management tool called "Helm" to deploy objects. So far so good; a GUI isn't magically able to trigger other software, after all.

However, with the stop button, iX decided to also start editing some of those objects themselves. Specifically "Deployments" and "StatefulSets", setting them to 0 "replicas", meaning "run nothing". That sounds completely fine, however: in these cases "Helm" is the actual management tool for those objects, so every time a Helm action is triggered, those modifications are instantly removed.
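
To illustrate (the names are placeholders, and this is not literally how the middleware does it), the stop button's effect is roughly equivalent to scaling workloads to zero by hand; any subsequent Helm action then simply re-applies the replica counts defined in the release:

k3s kubectl scale deployment myapp --replicas=0 -n ix-myapp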

That's where the problems start to become bigger and bigger, because Helm actions are triggered more often than people realise. For example, a reboot also triggers Helm, requiring the same "hacks" to put things "back to sleep" again.

iX also decided not to include all default objects that are technically "running", like DaemonSets, Jobs and CronJobs, which leads to issues with breaking jobs, or daemonsets/jobs locking access to PVCs. It becomes even more complicated, as Kubernetes does not only consist of those "default" objects: there are also "Custom Resources", objects defined by other charts, and other management tools, like operators, can add objects as well.

If such changes were made through Helm, it would be relatively easy for Chart/App developers to mitigate this. However, iX decided not to, and does not even expose the "stop" button state to Chart/App developers, leaving us completely without tools to mitigate these design flaws.

And that leaves out how the stop button can get into a near-endless state of limbo if not all running objects are stopped correctly... putting the cherry on top.

Looking for a better way forward

With all of that concluded, we decided to look into "what needs to be done" to get all our Charts to have "stop" button functionality back again. It's clear that the stop button, even with little fixes, isn't going to be a future-proof design. It needs to be completely redesigned, including all its backend logic. Sadly enough, refactors of said scale (pun not even intended) are currently not the priority of iXsystems, so not something we can rely on for our users.

We concluded that the only way to do this reliably is through Helm itself. We know which objects we have and how they need to be stopped, and we can do so reliably through Helm. Which means: do it ourselves!

The solution: TrueCharts Stop-All

With the most recent updates, we've introduced a new option: TrueCharts Stop-All. This option will cause all your running objects to slowly shut themselves down or, in the case of our PostgreSQL backend (CNPG), "hibernate".

It's designed to support all default Kubernetes objects deployed using our common chart: DaemonSets, Deployments, StatefulSets, Jobs and CronJobs. On top of that, we can easily expand it to cover any operator-based objects, like CNPG, that need to be shut down as well in the future!

How To Use Stop-All

On SCALE

On SCALE, this is a little checkbox when editing the App. Check it and it's done.

NOTE: Do not forget to uncheck the "Stop-All" checkbox to start the App again.

Using Helm

On native Helm, the same functionality is also available: simply set the following in your values.yaml file:

global:
  stopAll: true
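
For example, when applying this to an existing release without editing a values file, something like the following should work (the release name myapp and the chart reference truecharts/myapp are placeholders for your own install):

helm upgrade myapp truecharts/myapp --reuse-values --set global.stopAll=true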

· 3 min read
Kjeld Schouten-Lebbing

We're glad to finally announce the end of our code-freeze. A few days ago we re-enabled our automatic updates, and within a few weeks everything should balance out again automatically!

At the same time, we've not completely finished porting all stable-train charts to the new common; 65 are still missing. But we'll clearly label those updates as breaking in the changelog when they come in. Most of those are charts that have more complications than anticipated, so they need a little quality time with our maintainers, which takes a while.

Known Issues

Now that we're mostly done, we also need to report a few known issues with the new backend:

  1. DO NOT USE THE STOP BUTTON

The Stop button should not be used on any TrueNAS SCALE Apps that use PostgreSQL. Due to severe design mistakes by iXsystems, it will get into an endless loop and never finish. We've reported the issue to iXsystems and they are not interested in fixing this.

  2. PostgreSQL breaking on reboot

We've seen some edge cases where the new database backend breaks after a reboot, often after the STOP button was used, though we cannot trace the issue back to the use of the stop button itself. These issues have been reported to the folks over at CNPG, and we've also sent them an email to discuss whether we can fund them to fix these issues.

  3. hostNetworking changes

After much R&D, our staff have discovered quite a few nasty Kubernetes-level bugs with hostNetworking. As a result, we've decided to no longer enable it by default on any of our charts/apps, as we cannot guarantee its stability. For charts that often require this setting (like Tailscale), users will have to manually and explicitly enable it from now on.

The setting has also moved in the GUI.

  4. Deprecated certificate system and you

With most Charts ported, we want to highlight the fact that the "TrueNAS SCALE (Deprecated)" certificate option should not be used anymore. We cannot guarantee its stability, nor can we do anything at all to help out. It will also be removed as an option in the future, though that will be months rather than weeks.

The future

With the charts slowly all being ported, we can start working on our long-term plans again. One of those plans is a renewed focus on native Helm Charts.

For May and June, we're planning to go all-in on improving documentation for using our charts as normal Helm charts. At the same time, we're going to work on ensuring all our SCALE-specific tricks (of which only a few are left, luckily) will have automatic alternatives for normal Kubernetes clusters.

To highlight this, we've asked Artifact Hub to highlight our Common-Library chart as an "official" TrueCharts Helm chart. All users of Helm should be able to use the power of this advanced common library to build the Helm Charts they please... without even relying on TrueCharts to host their charts for them!

Check it out here: https://artifacthub.io/packages/helm/truecharts-library-charts/common
And also check out the docs as always: https://truecharts.org/manual/helm/common/
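
As a minimal sketch of what consuming the library looks like, a chart declares common as a dependency in its Chart.yaml; the chart name, version and repository URL below are illustrative, so take the exact values from the Artifact Hub page linked above:

# Hypothetical Chart.yaml for a chart built on the common library
apiVersion: v2
name: my-app
version: 0.0.1
dependencies:
  - name: common
    version: "x.y.z"
    repository: https://library-charts.truecharts.org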

· One min read
Kjeld Schouten-Lebbing

While most of our migration to the new common worked out reasonably well, we've received many issues regarding another change. Our move of the "Arr" Apps, like Radarr and Prowlarr, to their new PostgreSQL backend ended up terribly.

We did not correctly anticipate how hard that migration was going to be for our users, and we also encountered a number of bugs and design mistakes in those Apps. After long consideration and attempted bugfixing, we've decided to revert the "Arr" Apps from PostgreSQL back to SQLite.

This also means that after the next update (which will be flagged as breaking due to reverting the database change), you will also be able to neatly import your "Arr" App backups from the old common again.

We're very sorry for this revert, and we completely understand that we should've done considerably more research before implementing this move to a different database backend. The revert should be made available shortly, within 24 hours.

· 4 min read
Kjeld Schouten-Lebbing

We're close to releasing the breaking port of another 50+ of our "Stable" train charts to the new common train. With this, we want to look back on a few things we've noticed with the initial release:

Breaking Changes

Generally speaking, any change in the first semver digit of our versions means a potentially breaking change. How much this affects you usually depends on both the update and your personal setup. In this specific case, we want to make extra clear that 99.9% of our SCALE Apps will require a manual reinstall.

For SCALE: this also means anything in databases is going to be completely wiped unless you have HeavyScript/TrueTool backups and/or have followed one or more of our community migration guides. We should've been clearer that this behavior includes any and all databases and is not limited to MariaDB. Sadly enough, this "wipe on App deletion" is a design decision in TrueNAS SCALE and not something we have influence over.

Our Helm users can, in most cases, get by with adapting their current values(.yaml) file in accordance with the new structure, though databases will still get wiped when doing the update.

GPU Support

GPU support hit two snags:

  • One was an obscure SCALE bug where dicts with one item didn't get rendered in the GUI (and its output) correctly. We've created a temporary patch to compensate for this.
  • The other was a minor permission issue, namely an additional group that should've been passed but got lost in translation from the old to the new common.

Both have by now been resolved and are (being) rolled out. In the future we plan to prevent at least the first kind of issue more thoroughly, by manually checking whether the interface behaves correctly when doing big GUI changes.

Addons

We're still having some issues getting the Addons, primarily the VPN addon, to behave correctly. Mostly this is due to significantly increased hardening of our default Kubernetes deployment. We expect this to be fixed within a week or two; in the meantime, users depending on our charts with the VPN addon might want to wait a little.

Discord

There is some annoyance over the fact that we use Discord for support. We're aware of this and are actually contemplating moving to another platform (as well). Sadly enough, we do not have unlimited time available to work on the new common, release a new branding style and expand support to another platform. Users can expect a Discord alternative either at the end of 2023 or somewhere in 2024.

Verbal Abuse

A much less okay subject is the fact that multiple of our staff members have suffered verbal abuse of varying degrees. Some cases even led to the platform (Reddit, Discord, etc.) needing to step in and take action. While sometimes a staff response might seem a tad blunt or not to your liking, some of the things we've seen are completely and utterly unacceptable. We have a head moderator, JagrBombs, if you have any issue with a staff member.

We've taken steps to prevent needlessly exposing our staff to this, one of which is limiting our presence within certain communities on an as-needed basis.

Conclusion

In the end we've gotten a lot of feedback on the new release. Understandably, many users are/were upset that a reinstall was required. We want to highlight that we understand the frustration, but with the scope of these changes, a complete rewrite of our Common backend, we didn't have much choice on SCALE. It's important to note that users on SCALE cannot update via the update button in almost all cases, so users do not have to worry about magically losing data by using the update button for this release.

Another topic we've seen mentioned was "but they say they are production ready". We want to be completely clear about this: TrueCharts is not production ready at this time. In the future, after a separate announcement, only our "Enterprise" train will be considered "production ready". We want to highlight that this does not mean "stable" users can expect these breaking changes more often, as we don't plan to put another 700+ hours into the common chart any time soon. But it does mean users should NEVER depend on our stable train for production, unless they do so at their own risk.

We wish all our users the best in going through these migrations and our support staff is available on Discord if you need any help.

· 2 min read
Kjeld Schouten-Lebbing

We hope everyone had an amazing Easter; we know we had a busy one, to say the least!

We are excited to announce that we have completed porting the first 222 charts in our "stable" train to our new "common" library chart. This chart serves as the basis for all of our apps and charts, and we believe that it will provide a more stable and reliable foundation for all of our future work.

While there are still over 160 charts left to be ported in our stable train, we expect to complete this work before the end of the month. To ensure that we have sufficient time to complete this work, we are extending our code freeze for the stable train until May 1, 2023. After this date, we guarantee that we will resume our normal update schedule.

In addition, we want to make it clear that we have lifted the code freeze for our "Enterprise" and "Dependency" charts, and will continue to provide updates for these charts on a regular basis.

It is important to note that this update is considered "most likely breaking," and will likely wipe all databases used in charts. We also anticipate that there may be some regressions, which is why we encourage users to file bug reports or contact our support staff if they experience any issues.

We would like to take this opportunity to thank our community for their patience and understanding as we work to improve our platform. We believe that these updates will provide significant benefits in terms of stability, reliability, and functionality, and we look forward to sharing them with our users in the coming weeks and months.

As always, we welcome any feedback or suggestions from our users, and we remain committed to providing the best possible experience for everyone who uses our platform. Thank you again for your support, and we look forward to continuing to work with you in the future.