In the impressive demo of OpenAI’s Codex programming system, the presenters suggest that Codex enables natural language to be used as a new kind of programming language.
Here’s how that might look, followed by one problem with the suggestion.
A natural-language programmer writes natural language prompts, and Codex transforms them into source code. The Codex model is made deterministic, such that rerunning the model on the prompts always produces the same source code. The user can make further modifications to the source code that the model produces. The result is two parallel streams of text: (1) the natural language, and (2) the final source code. These parallel streams can be saved in version control, shared, modified, remixed, and run.
A non-programmer might still be able to make modifications, simply by modifying the natural language text, without making modifications to the source code at all. This could be a way to learn a bit about programming for the curious non-programmer too, as they can observe the changes to the source code that result from changing the natural language text.
It is tempting to think that only the natural language and the diff between the model’s output and the final source code need be stored. However, this doesn’t hold up. When the model generates source code, its prompt includes all the earlier source code in the file. So if a programmer modifies a natural language prompt or a piece of code earlier in the file, all the generated code later in the file could be invalidated. Rerunning the model on the new state of the program wouldn’t necessarily reproduce the later source code verbatim, even with the model made deterministic.
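To make the invalidation problem concrete, here is a minimal sketch. It does not call Codex: `fake_model` is a hypothetical stand-in that simply hashes its prompt, which is enough to mimic a deterministic model whose output depends on everything earlier in the file. The names `fake_model` and `generate_file`, and the two example descriptions, are assumptions made up for illustration.

```python
import hashlib


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a deterministic code model.

    The same prompt always yields the same output; any change to the
    prompt yields (almost certainly) a different output.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:8]
    return f"generated_{digest}()\n"


def generate_file(descriptions: list[str]) -> list[str]:
    """Generate one code block per description.

    Each prompt contains all earlier text and code in the file,
    followed by the new natural language description.
    """
    context = ""
    blocks = []
    for text in descriptions:
        prompt = context + f"# {text}\n"
        code = fake_model(prompt)
        blocks.append(code)
        context = prompt + code  # later blocks see this block too
    return blocks


original = generate_file(["load the data", "plot the data"])
rerun = generate_file(["load the data", "plot the data"])
edited = generate_file(["load the data from a URL", "plot the data"])

# Rerunning on unchanged prompts reproduces the code exactly.
print(rerun == original)
# Only the first description was edited, yet the second block is compared too.
print(edited[1] == original[1])
```

In this sketch the second description is untouched in `edited`, but its generated block no longer matches the original, because its prompt includes the changed first block. A stored diff against the old output for that block would no longer apply cleanly.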
The editor showing the program might be modified to reflect this. I envision it showing two columns: the left column with the natural language text and the right column, aligned with the left, with the source code. A block of text and code shows with a green background if the model, when run on the text, produces exactly the source code on the right. It shows in yellow if the source code produced by the model differs slightly from the actual code in the block (and the diff is highlighted). And it shows in red if the model’s output differs from the actual code in the block by more than a simple diff can capture.
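The colour rule for a block could be sketched roughly as follows, using Python's standard `difflib`. The function names and the `yellow_threshold` of 0.8 are assumptions chosen for illustration; what counts as a "simple diff" is left open above, and a real editor might use a different measure entirely.

```python
import difflib


def block_status(model_output: str, actual_code: str,
                 yellow_threshold: float = 0.8) -> str:
    """Classify a block as 'green', 'yellow', or 'red'.

    yellow_threshold is an assumed cutoff on difflib's similarity ratio
    (0.0 to 1.0); it stands in for "differs only slightly".
    """
    if model_output == actual_code:
        return "green"
    similarity = difflib.SequenceMatcher(None, model_output, actual_code).ratio()
    return "yellow" if similarity >= yellow_threshold else "red"


def highlight_diff(model_output: str, actual_code: str) -> list[str]:
    """Return unified-diff lines an editor could highlight in a yellow block."""
    return list(difflib.unified_diff(
        model_output.splitlines(),
        actual_code.splitlines(),
        fromfile="model",
        tofile="actual",
        lineterm="",
    ))


generated = "total = sum(values)\n"
hand_edited = "total = sum(values) + 1\n"
rewritten = "import os\nfor name in os.listdir('.'):\n    print(name)\n"

print(block_status(generated, generated))
print(block_status(generated, hand_edited))
print(block_status(generated, rewritten))
print("\n".join(highlight_diff(generated, hand_edited)))
```

With these inputs, the identical block is classified green, the small hand edit yellow, and the wholesale rewrite red. A character-level ratio is a crude measure; a token-level or syntax-aware comparison might match a programmer's sense of "slightly different" better.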