In my experience there are really only three true prompt engineering techniques:
- In Context Learning (providing examples, AKA one shot or few shot vs zero shot)
- Chain of Thought (telling it to think step by step)
- Structured output (telling it to produce output in a specified format like JSON)
Maybe you could add what this article calls Role Prompting to that. And RAG is its own thing where you're basically just having the model summarize the context you provide. But really everything else just boils down to telling it what you want in clear, plain language.
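In-context learning is mostly just string assembly. A minimal sketch of a few-shot prompt builder (the sentiment task and the Input/Output formatting are my own illustration, not any model's required format):

```typescript
// Build a few-shot prompt: each example pairs an input with the desired
// output, so the model infers the task and output format from the pattern.
interface Example {
  input: string;
  output: string;
}

function buildFewShotPrompt(task: string, examples: Example[], query: string): string {
  const shots = examples
    .map((ex) => `Input: ${ex.input}\nOutput: ${ex.output}`)
    .join("\n\n");
  // End at "Output:" so the model's completion is the answer itself.
  return `${task}\n\n${shots}\n\nInput: ${query}\nOutput:`;
}

const prompt = buildFewShotPrompt(
  "Classify the sentiment of each sentence as positive or negative.",
  [
    { input: "The build passed on the first try.", output: "positive" },
    { input: "The tests have been flaky all week.", output: "negative" },
  ],
  "Deploys are fast again."
);
```

Zero shot is the same thing with an empty examples array; one shot vs. few shot is just how many entries you pass.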
Dunno. I was working on a side project in TypeScript, and couldn’t think of the term “linear regression”. I told the agent, “implement that thing where you have a trend line through a dot cloud”, or something similarly obtuse, and it gave me a linear regression in one shot.
I’ve also found it’s very good at wrangling simple SQL, then analyzing the results in Bun.
I’m not doing heavy data processing, but so far, it’s remarkably good.
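For reference, the "trend line through a dot cloud" is a few lines of ordinary least squares. This is the generic textbook version, not the code the agent actually generated:

```typescript
// Ordinary least squares: fit y = slope * x + intercept to a set of points.
interface Point {
  x: number;
  y: number;
}

function linearRegression(points: Point[]): { slope: number; intercept: number } {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  // slope = covariance(x, y) / variance(x)
  let cov = 0;
  let varX = 0;
  for (const p of points) {
    cov += (p.x - meanX) * (p.y - meanY);
    varX += (p.x - meanX) ** 2;
  }
  const slope = cov / varX;
  return { slope, intercept: meanY - slope * meanX };
}
```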
Linear regression is a non-niche, well-understood topic that's used in many domains other than data science.
However, asking it to implement "that thing that groups data points into similar groups" needs a bit more context (I just tried it), as k-means is very much specific to machine learning.
I see that as applying to niche platforms/languages without large public training datasets - if Rust was introduced today, the productivity differential would be so stacked against it that I’m not sure it would hypothetically survive.
Even role prompting is totally useless imo. Maybe it was a thing with GPT3, but most of the LLMs already know they're "expert programmers". I think a lot of people are just deluding themselves with "prompt engineering".
Be clear with your requirements. Add examples, if necessary. Check the outputs (or reasoning trace if using a reasoning model). If they aren't what you want, adjust and iterate. If you still haven't got what you want after a few attempts, abandon AI and use the reasoning model in your head.
It's become more subtle, but it's still there. You can bias the model towards more "expert" responses with the right terminology. For example, a doctor asking a question will get a vastly different response than a normal person. A query with emojis will get more emojis back. Etc.
This is definitely something I’ve noticed — it’s not about naïve role-priming at all, but rather about language usage.
“You are an expert doctor, help me with this rash I have all over” will result in a fairly useless answer, but using medical shorthand — “pt presents w bilateral erythema, need diff dx” — gets you exactly what you’re looking for.
I get the best results with Claude by treating the prompt like a pseudo-SQL language, treating words like "consider" or "think deeply" like keywords in a programming language. Also making use of their XML tags[1] to structure my requests.
I wouldn't be surprised if in a few years from now some sort of actual formalized programming language for "gencoding" AI is gonna emerge.
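For what it's worth, the XML-tag approach is easy to mechanize. Something like this (the tag names here are just ones I happen to use; Anthropic's docs recommend XML tags in general but don't mandate a particular set):

```typescript
// Wrap each part of the request in a named XML tag so the model can
// distinguish instructions from context from examples.
function xmlSection(tag: string, body: string): string {
  return `<${tag}>\n${body}\n</${tag}>`;
}

function buildTaggedPrompt(instructions: string, context: string, examples: string[]): string {
  return [
    xmlSection("instructions", instructions),
    xmlSection("context", context),
    ...examples.map((ex) => xmlSection("example", ex)),
  ].join("\n\n");
}

const taggedPrompt = buildTaggedPrompt(
  "Summarize the context in one sentence. Think deeply before answering.",
  "Bun is a JavaScript runtime with a built-in SQLite driver.",
  ["Context about Postgres -> 'Postgres is a relational database.'"]
);
```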
One thing I've had a lot of success with recently is a slight variation on role-prompting: telling the LLM that someone else wrote something, and I need their help assessing the quality of it.
When the LLM thinks _you_ wrote something, it's nice about it, and deferential. When it thinks someone else wrote it, you're trying to decide how much to pay that person, and you need to know what edits to ask for, it becomes much more cut-throat and direct.
I've noticed this affects its tendency to just make things up in other contexts, too. I asked it to take a look at "my" github, gave it a link, then asked it some questions; it started talking about completely different repos and projects I'd never heard of. When I simply said take a look at `this` github and gave it a link, its answers had a lot more fidelity to what was actually there (within limits, of course - it's still far from perfect). [This was with Gemini 2.5 Flash on the web.] I've had similar experiences asking it to do style transfer from an example of "my" style versus "this" style, etc. Presumably this has something to do with the idea that in training, every text that speaks in first person is in some sense seen as being from the same person.
The main thing, I think, is people trying to do everything in "one prompt", or in one giant request that throws all the context at it. What you said is correct, but instead of making one massive request, break it down into parts: multiple prompts, each with a smaller context, that all have structured output you feed into each other.
Make prompts focused, with explicit output and examples, and don't overload the context. Then the 3 you said, basically.
>We test three representative tasks in materials chemistry: linking dopants and host materials, cataloging metal-organic frameworks, and general composition/phase/morphology/application information extraction. Records are extracted from single sentences or entire paragraphs, and the output can be returned as simple English sentences or a more structured format such as a list of JSON objects. This approach represents a simple, accessible, and highly flexible route to obtaining large databases of structured specialized scientific knowledge extracted from research papers.
Short answer: It’s a way to generate structured databases for (most) scientific topics. Why? Apply data driven methods to these databases. So what? It’s a powerful way to ask and investigate scientific questions/trends otherwise hidden inside a million scientific papers.
Example: Consider what PDB has done for our understanding of protein folding, as well as the ML/computational techniques they’ve enabled (eg, Alphafold). Most scientific questions and properties are not as data-rich as protein folding. What if they could be?
Longer answer: The last 15 years in computational/ML + science have shown that structured databases open up entirely new frontiers in discovery (eg Protein Data Bank, Materials Project). But most scientific topics/properties are NOT in structured DBs, they’re scattered about in millions of papers. It’s especially a huge problem in some topics in materials science. It’s not that these problems are data scarce, but that it’s hard to actually collate their data in a structured format. You literally cannot use most ML methods because structured DBs do not exist.
This paper is a way to generate massive structured databases of specialized, intricate, and hierarchical knowledge graphs from scientific literature. Fine tuning works, prompt engineering does not (at the time, perhaps this has changed). Once you have a database, you can analyze an entire subfield or topic in science with ML or stats methods.
Chain of Thought prompting loses much of its effectiveness on newer reasoning models like the GPT "o" series and Claude Sonnet.
As an exercise for the reader, I encourage you all to try the examples vs. control prompts in prompt engineering papers for chain of thought prompting, and you’ll see that the latest models have either been trained to or instructed to reason by default now - the outputs are close enough to equivalent.
CoT prompting was probably much more effective a few years ago on older, less powerful models.
You may find some benefit in telling it exactly how you want it to reason about a problem, but note that you may actually be limiting its capabilities that way.
I’ve found that most of the time, I will let it use its default reasoning capabilities and guide those rather than supplying my own.
You use a two-phase prompt for this. Have it reason through the answer and respond with a clearly-labeled 'final answer' section that contains the English description of the answer. Then run its response through again in JSON mode with a prompt to package up what the previous model said into structured form.
The second phase can be with a cheap model if you need it to be.
You can do this conversationally, but I've had the most success with API requests, since that gives you the most flexibility.
Pseudo-prompt:
Prompt 1: Do the thing, describe it in detail, end with a clear summary of your answer that includes ${THINGS_YOU_NEED_FOR_JSON}.
Prompt 2: A previous agent said ${CONTENT}, structure as JSON according to ${SCHEMA}.
Ideally, use a model in Prompt 2 that supports JSON schemas so you have a 100% guarantee that what you get back parses. Otherwise you can implement it yourself by validating locally and sending the errors back with a prompt to fix them.
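A sketch of the two-phase flow in TypeScript. `CallModel` stands in for whatever API client you use (it's a placeholder, not a real SDK type), and the validate-and-retry loop at the end is the hand-rolled fallback for models without native JSON-schema support:

```typescript
// Phase 1: free-form reasoning ending in a clearly labeled final answer.
// Phase 2: a (possibly cheaper) model packages that answer as JSON.
type CallModel = (prompt: string) => Promise<string>;

function phase1Prompt(task: string, fields: string[]): string {
  return (
    `${task}\nDescribe your reasoning in detail, then end with a section ` +
    `titled "FINAL ANSWER" that includes: ${fields.join(", ")}.`
  );
}

function phase2Prompt(previousAnswer: string, schema: string): string {
  return (
    `A previous agent said:\n${previousAnswer}\n\n` +
    `Return ONLY a JSON object matching this schema:\n${schema}`
  );
}

// Validate locally; on failure, send the bad reply back with a prompt to fix it.
async function getJson(
  callModel: CallModel,
  prompt: string,
  isValid: (parsed: unknown) => boolean,
  maxRetries = 3
): Promise<unknown> {
  let current = prompt;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const reply = await callModel(current);
    try {
      const parsed = JSON.parse(reply);
      if (isValid(parsed)) return parsed;
      current = `${prompt}\n\nYour last reply failed validation. Fix it:\n${reply}`;
    } catch {
      current = `${prompt}\n\nYour last reply was not valid JSON. Fix it:\n${reply}`;
    }
  }
  throw new Error("model never produced valid JSON");
}
```

In practice you'd call `getJson(callModel, phase2Prompt(finalAnswer, schema), validator)`, where `finalAnswer` is the "FINAL ANSWER" section extracted from the Phase 1 response.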