Caramel: An OCaml for the Erlang VM

cultofmetatron · on March 5, 2021

2 years ago, I went all in with elixir/phoenix with my startup. Happy to see a language like this which appears to handle integration with the beam ecosystem as a first class consideration. I can see this being useful for some parts of the app where I need to integrate with external services.

praveenperera · on March 5, 2021

If you’re looking for a typed BEAM language with great interop with Elixir and other BEAM languages, have you seen Gleam?

https://github.com/gleam-lang/gleam

sagolikasoppor · on March 6, 2021

How did it work out for you?

cultofmetatron · on March 6, 2021

there's been a few challenges related to lack of libraries and the occasional dependency hell but overall it has been our secret weapon.

We can knock out features very fast and the api server has NEVER been the bottleneck. performance is a dream. Elixir has been refreshingly low on hype and strong on delivery.

We were able to scale using Oban to run background workers. The BEAM makes runnign background workers so easy we can add background processes without needing to setup any external orchestration. This lets me focus on building features instead of puttng together micro-serverices or orchestration tooling

Channels let us consider realtime from the get go. Since push functionality over web-sockets is so cheap resoruces wise, we use it everywhere. Our scaling strategy when the database is he bottleneck is to insert data as a job in oban job and send the result out over a channel. This would be an orchestration nightmare in production with any other system.

WE have a few other challenges (which I can't go into) where elixir makes it possible for a small team to do things that would be impossible in other languages without vastly more resources. Suffice to say, the ability to create independent processes under a supervision tree both via configuration and dynamically via the registry allow us to create scalable topologies that would be a crushing level of complexity to manage without the solid shoulders that the beam stands on.

Basically, if realtime is a component of your startup, phoenix does not dissapoint.

may4m · on March 5, 2021

I was looking at elixir and F# for my next project. I liked F# for the type system and elixir for OTP/Beam. This looks like something I wanted

bmitc · on March 5, 2021

You might also be interested in Gleam.

https://gleam.run/

Is it possible to share any details about your project? I'm just curious since I also use F# and am learning Elixir.

devmunchies · on March 5, 2021

Does gleam let me destructure a list so that I can bind the head and body each to a variable?

e.g. in f#

    let splitHeadAndTail = function
      | h :: [] -> printfn "got head and EMPTY tail"
      | h :: t -> printfn "got head and tail"
      | [] -> printfn "got empty list"

The example in their docs does a 1:1 map and doesn't show if you can bind the whole tail[1]

[1]: https://gleam.run/book/tour/case-expressions.html#destructur...

derefr · on March 5, 2021

Yes; see https://hexdocs.pm/gleam_stdlib/gleam/list/. The syntax is [head, ..tail]

It'd be kind of silly to create a greenfield BEAM language that doesn't expose one of the most fundamental operations you can do in BEAM bytecode, and that every list-processing function ultimately relies upon :)

devmunchies · on March 5, 2021

> It'd be kind of silly to have a BEAM language that doesn't expose one of the most primitive operations you can do in the BEAM, that every list-processing function ultimately relies upon :)

Agree, which is why i was confused it wasn't in the tour I linked to in the Lists or Destructuring sections

centimeter · on March 5, 2021

Why not just use Haskell? You don't get OTP/Beam, but the GHC runtime provides almost all the same functionality (plus a lot more), to the point where the Cloud Haskell library is basically a full clone of Erlang's actor functionality.

nine_k · on March 6, 2021

Does it provide all of the OTP? Can you write in the "let it fail" style and be certain that the failing parts will be restarted, etc?

Also, BEAM is somehow more battle-tested.

leostera · on March 6, 2021

Hola folks, I'm making Caramel. AMA.

There's usually updates on Twitter too here: https://twitter.com/CaramelLang

akavel · on March 6, 2021

In the installation guides, please mention at least in passing that it needs Erlang to be installed as a prerequisite. Being a casual reader, I wasn't sure about that until I found https://caramel.run/manual/getting-started/first-steps.html#.... Notably, the "it's just a single binary" might be highly misleading in this context for people not coming from the BEAM users community.

Also, it would be super cool if I could easily find a list of limitations/differences vs. conventional OCaml language & runtime. And a hello-world snippet on the front page (or possibly some other tiny one showing some cool BEAM feature), it took me a while to find the "examples" folder in the github repo.

Please note those are just some tiny complaints rooted in the fact that I absolutely love what you did here! :)

leostera · on March 6, 2021

Good points! I've made 3 issues to make sure we get this done.

https://github.com/AbstractMachinesLab/caramel/issues/73 https://github.com/AbstractMachinesLab/caramel/issues/74 https://github.com/AbstractMachinesLab/caramel/issues/75

Thanks a lot <3 - you're welcome in our Discord: http://discord.caramel.run

pharmakom · on March 5, 2021

The creator of Erlang said that static typing was not appropriate for the applications Erlang is designed for. Worth keeping this in mind!

StreamBright · on March 5, 2021

I don't think he ever said such a thing. Do you have the source of this claim?

Some back story:

"Through the years, there were some attempts to build type systems on top of Erlang. One such attempt happened back in 1997, conducted by Simon Marlow, one of the lead developers of the Glasgow Haskell Compiler, and Philip Wadler, who worked on Haskell's design and has contributed to the theory behind monads (Read the paper on said type system). Joe Armstrong later commented on the paper:

One day Phil phoned me up and announced that a) Erlang needed a type system, b) he had written a small prototype of a type system and c) he had a one year’s sabbatical and was going to write a type system for Erlang and “were we interested?” Answer —“Yes.”

Phil Wadler and Simon Marlow worked on a type system for over a year and the results were published in [20]. The results of the project were somewhat disappointing. To start with, only a subset of the language was type-checkable, the major omission being the lack of process types and of type checking inter-process messages."

https://learnyousomeerlang.com/types-or-lack-thereof

pharmakom · on March 5, 2021

sorry I can't remember where I read it.

one quote I found:

Q: "Erlang is currently a dynamically typed language. Are you saying you would like some static type as optional typing or both, or would you change the nature of Erlang to static typing?"

A: "No, I wouldn't change it, but I would like subdomains where it is statically typed. You could actually encapsulate parts in statically typed."

So it sounds like he stands by dynamic as the correct choice for Erlang, although not all of the time.

Twisol · on March 5, 2021

And I think that speaks to the current state of type system research on communicating processes, not to any fundamental inability thereof. Session types are a recent approach (though one that I don't think goes far enough; give me full-duplex channels!)

jasone · on March 5, 2021

Erlang APIs are designed around the availability of dynamic pattern matching on messages, but static message typing is definitely feasible. The only feature of Erlang that stands out me as fundamentally incompatible with static typing is "live upgrade". And although this feature might be critical in the context for which it was designed (Ericcson telecom switches), it appears that very few Erlang-based deployments use live upgrade; the most common practice is to restart services.

toast0 · on March 5, 2021

I'm not that up on static typing, but it would seem to me that static typing would be difficult with different versioned nodes on dist as well as hot upgrade.

While a lot of users avoid hot load and restart/replace nodes instead, I don't think many restart/replace the whole cluster.

bjz_ · on March 6, 2021

Yeah, would really love to see a static type system that tackled this directly - ie. handling versioned nodes in a cluster and ensuring deployments happen safety. I think it would be possible, but it would require some careful thought and design.

dnautics · on March 5, 2021

> different versioned nodes

Exactly. Blue-green deploys could get seriously messed up if the static typing system doesn't strongly think about this.

carlmr · on March 6, 2021

I'm not too familiar with Erlang. Doesn't dynamic typing just mean you have to strongly think about it?

toast0 · on March 6, 2021

If you follow the philosophy of Erlang, you can take shortcuts to strongly thinking about it.

On the server side: what happens if a process geta a message it doesn't handle? Either it will crash (and be restarted by supervision, hopefully), it will receive and discard it (possibly logging it), or it will ignore it and leave it in the message queue (not great for long term memory use).

On the client side: what happens if the process sends a message that is ignored or crashes the receiver? The client will timeout, and won't know if the work was done or not; best to bubble up the error, perhaps by crashing (and maybe restarting by supervision).

If you're ok with these 'worst cases', you don't have to proactively deep think. The proper deployment strategy is clear: deploy servers that handle new messages first, then deploy clients that send new messages, but if you mess the order, you'll be within your failure model. It gets more tricky when there's a difference in data types and a piece of data could be used by old code, then new code, then old code, of course.

Of course, the actual worst cases are more fun, but less likely to happen. You could find a bug that crashes BEAM or something; I don't think that would be easily found by sending a unexpected message, but maybe (depends on what you do with unexpected messages).

But if your static typing just says yeah I dunno, it could crash, I don't think you've gained much from the exercise. I'm probably biased against static typing, but if it can model the system and provide useful insights that are accurate on the system, that sounds helpful; if it can't model the system though...

macintux · on March 6, 2021

You're absolutely right: with Erlang you have to spend a great deal of effort planning for, implementing, and testing code that can work correctly across nodes with different versions of the code and/or versions of Erlang itself.

I think the open question is whether strong, _static_ type system can handle it at all.

dnautics · on March 7, 2021

you actually don't. You just have to use pattern matching correctly.

macintux · on March 7, 2021

Well, yes, that's the primary tool, but from my now-fuzzy memories working on Riak at Basho, I recall there being subtle and not-so-subtle risks and complexities to manage.

chemeng · on March 6, 2021

I believe the specifics were in regards to hot code reloading. Robert Virding (one of the creators of Erlang). This may be the specific quote you are remembering.

"I just want to confirm what has already been mentioned a number of times in this thread: the reason for not having static typing in Erlang was the absolute need for doing dynamic code upgrades of running systems. This was an absolute requirement and if we couldn’t do it then our language/system wasn’t interesting. There aren’t many systems using it today as most try to do rolling upgrades but we didn’t have that option back then so we had to be able to upgrade a running system. Remember we are talking late 80’s early 90’s here."

There is more to his comment and the thread that's worth reading:

https://elixirforum.com/t/how-hard-would-it-be-to-have-a-str...

lf-non · on March 6, 2021

There has been a lot of research and implementation work around gradual typing since erlang was initially conceived.

Using a type-system as a linter (with a few escape hatches whenever dynamic behavior is needed and without any impact on runtime) is still a great step forward as opposed to no types at all.

jimbokun · on March 5, 2021

Knowing nothing else about this project, it falls in the category of things about which I thought "Wouldn't it be cool if somebody..."

chromatin · on March 6, 2021

This looks very cool, and I have been casually looking into OCaml lately.

Please excuse this naïve question, but what is the main value proposition here over plain OCaml: Erlang/BEAM library ecosystem interop, or more something relating to high concurrency? It wasn’t really clear to me from the website/docs.

jaegerpicker · on March 5, 2021

I loved working with the Beam VM and Elixir. I just wish it had a stronger presence in the Data/ML/AI fields. As scalable and trivially parallizable as Elixir is with a fast SciPy/NumPy/Pandas type of eco-system and you'd have a fantastic environment for data heavy apps IMO.

jbott · on March 5, 2021

Have you seen nx[1]? It looks like the first step towards this!

1. https://dashbit.co/blog/nx-numerical-elixir-is-now-publicly-...

dsiegel2275 · on March 5, 2021

Have you seen the other three replies in this thread?

a_bored_husky · on March 5, 2021

Have you seen project Nx?

https://github.com/elixir-nx/nx

rmetzler · on March 6, 2021

Lots of people mention Nx and of course I want to do the same but with a link to a youtube video, in case you wonder what's possible currently.

https://www.youtube.com/watch?v=fPKMmJpAGWc

joisig · on March 5, 2021

Take a look at https://github.com/elixir-nx/nx

machiaweliczny · on March 6, 2021

There's also Broadway for ETL

namelosw · on March 6, 2021

This is cool. It seems that it even supports Reason syntax.

There are really many languages start targeting BEAM lately. Quite a pleasant trend since Erlang was always underrated IMO.

vihangd · on March 6, 2021

Now only if we could have a decent OCaml/F# for the JVM too.. at least for GraalVM

lpedrosa · on March 6, 2021

Java as a language is becoming more and more like OCaml.

Once sealed types are introduced, we'll get closer and closer since you already have records and pattern matching.

Obviously, it'll never be the same. Java will always have it's ways of doing things, the community, etc.

I don't know who said it, but it was probably here in HN, where "if you squint very hard, you'll see what Brian et al are trying to accomplish". This was regarding the MLfication of Java.

darksaints · on March 6, 2021

I know it's not OCaml, but you should really give Scala a try. Martin Odersky has never been shy about his love for SML and OCaml, and it really shines through in the semantics found in Scala.

programLyrique · on March 6, 2021

There is ocamljava: http://www.ocamljava.org/

But it does not seem to be maintained.

toolz · on March 5, 2021

being unable to do dynamic dispatch seems like a big show-stopper for distributed/meshed apps, which seems to be the only reason for targeting the BEAM to begin with, no?

If you don't pass functions around as MFAs in a distributed app you run into trouble with upgrading your cluster as they'll have different versions of a function during a rolling deploy

jolux · on March 5, 2021

Different organizations use different amounts of the tools the BEAM provides in building their distributed applications. We containerize everything and don't really use the hot upgrade features.

toolz · on March 5, 2021

sure, but I'm not talking about hot upgrades, I'm talking about upgrading entire nodes that exist in a cluster. The nodes will not be able to pass around functions as an upgraded node in a cluster will have a different version of that function. So this is an issue without considering hot code reloading, as I don't use that feature either.

yawaramin · on March 5, 2021

So we're not talking about hot code reloading here, we're talking exactly about upgrading nodes in a cluster. Many--I'd say most--Erlang/Elixir deployments today don't do that. They use Kubernetes.

toolz · on March 5, 2021

Kubernetes does not solve the issue, it merely has strategies for how to upgrade nodes. The issue is that when you compile new code, even if the function didn't change at all, it will be a new version of the function. If the upgraded node then sends a function to another node to be executed (as is a very common practice in meshed OTP apps) the older node will not be able to execute the function.

Now, I understand that most erlang apps still don't utilize clustering and instead are stateless apps and that's fine, but ocaml can already do stateless apps just fine, right? So my assumption here is that they're putting ocaml on the BEAM to utilize OTP and clustered apps, correct? If that's the case I don't see how you could have a real-world app that you can quickly scale up/down (which is something you often want in a clustered app) - instead it seems like you'd be forced to spin up entirely new clusters and then direct traffic to the new cluster to bypass this problem. This of course isn't a viable tradeoff for very large clusters.

derefr · on March 5, 2021

I think what the parent is saying isn't that most production Erlang apps aren't clustered; it's that most production Erlang apps aren't clustered using the distribution protocol. Instead, they're clustered at the application layer, with explicit wire protocols like gRPC. You can upgrade nodes within such clusters just fine, because the ABI of the wire protocol is an explicit part of the application, something that can be stabilized separately from the app itself, rather than being an implicit property of what ERTS version you're using, or about how the application's processes communicate internally.

Note that even in the context of Ericsson, Erlang's distribution protocol was never designed for the use-case of horizontal N-node clustering. It was designed for static-role clustering — where each node is like an organ in your body, with a name and a specific function relative to the "system" that is your body. For example, "master" and "warm standby" roles. (By analogy to Kubernetes, the Erlang nodes of a distributed-OTP architecture are closest to being like the sibling containers of a single k8s pod. Except that Erlang's "containers" can be running on separate machines while still being part of the same "pod".)

If you want to scale such a system (a "pod" of nodes), you are supposed to spawn N copies of the entire distribution set; and then those N system-copies will coordinate with other such clones not via the distribution protocol, but via explicit coordination protocols. (Sometimes this explicit coordination protocol is designed to use the distribution protocol as a carrier — this is one of the reasons that Erlang supports making manual distribution-protocol connections between nodes, rather than forcing a connected topology on you. But such connections carefully avoid passing arbitrary RPC data across them, instead usually having an explicit "peer server" on both ends, where the servers speak a limited and ABI-stable term protocol to one another.)

Yes, the Erlang distribution protocol has been repeatedly taken beyond its design tolerances to accomplish horizontal scaling. CouchDB does this, for example. But many Erlang applications that you might think of as doing this, are actually much closer to static "organs in a body" architecture than you'd think, despite claims of scalability. Riak and Ejabberd, for two examples, both treat their nodes very much like static organs, rather than a fleet.

nivertech · on March 6, 2021

Erlang's built-in distribution works well for cluster management and low-traffic control plan, but not for the critical path / data plan [1].

In the later OTP releases Erlang distribution has become much better and become useful even for data plan in less demanding scenarios.

While it is possible to customize Erlang distribution for your specific network/workload, it is much easier to use plain TCP sockets for your data plan (back in the day I used 0MQ for data plan in the cluster).

Also riak and ejabberd (via Mnesia) use Erlang distribution for critical path and this was one of the reason for their poor performance.

I think you mean that ejabberd is not a cloud-native application (i.e. it treats nodes like pets instead of cattle). Modern BEAM apps are implementing cluster node-autodiscovery (I did it for my first Erlang service on AWS in 2010).

I'm not sure, but I think RabbitMQ uses the Erlang distribution for management only.

Phoenix apps using PubSub and Presence/Tracker will also use Erlang distribution by default, but it can be switched to use other backends, i.e Redis-based PubSub.

[1] https://www.slideshare.net/nivertech/erlang-on-osv-49278675#...

dnautics · on March 6, 2021

> "it's that most production Erlang apps aren't clustered using the distribution protocol"

Phoenix pubsub plugs into pg2 directly and automatically so if you've got an elixir cluster with phoenix you are basically "using erlang clustering" even if you don't realize you are.

yawaramin · on March 5, 2021

> Kubernetes does not solve the issue, it merely has strategies for how to upgrade nodes. The issue is that...

I understand the issue, Gleam/Caramel/et al. (static typing layers on BEAM) are not trying to solve this issue.

> So my assumption here is that they're putting ocaml on the BEAM to utilize OTP and clustered apps, correct?

That's actually not my assumption. I'm assuming that the point of Caramel is to get sound static typing for (some of) the code that we are deploying on the BEAM. It doesn't necessarily have to be for clustered apps. As I said, many deploys of even Erlang and Elixir today are not using clustered nodes. They don't care about those capabilities. They use Kubernetes to manage nodes and state.

> instead it seems like you'd be forced to spin up entirely new clusters and then direct traffic to the new cluster to bypass this problem.

This is exactly what Kubernetes (or Nomad etc.) will let you avoid--they can spin up and spin down nodes within the same cluster.

jolux · on March 5, 2021

As I understand it, there are real questions about the type theory needed to correctly express what the BEAM does, and they have yet to be answered sufficiently. Nonetheless, a lot of people clearly want static types in their BEAM applications, and there are certainly parts of any application that don't cross the network. I'm interested in efforts like Caramel and Gleam mostly because of my interest in BEAM as a general-purpose platform, and because I want to see if they come up with solutions to the type theory problems.

jolux · on March 5, 2021

I don't quite follow how static types change this situation?

macintux · on March 5, 2021

Obligatory list of alternative Erlang VM languages:

https://gist.github.com/macintux/6349828#alternative-languag...