sveltejs · dummdidumm · Jul 10, 2021 · Jul 11, 2021 · Jul 14, 2021 · Aug 16, 2021
diff --git a/text/preprocessing-api-rework.md b/text/preprocessing-api-rework.md
@@ -0,0 +1,139 @@
+- Start Date: 2021-07-10
+- RFC PR: (leave this empty)
+- Svelte Issue: (leave this empty)
+
+# Preprocessing API rework
+
+## Summary
+
+Introduce a new preprocessing API that is simpler yet allows for more flexibility.
+
+## Motivation
+
+The current preprocessing API is both a little hard to grasp at first and not flexible enough to satisfy more advanced use cases. Its problems:
+
+- Ordering is somewhat arbitrary: markup preprocessors run first, then script/style. Preprocessors that want to be executed at a different point are forced into dirty workarounds. This also led to a PR implementing something of an escape hatch (https://github.com/sveltejs/svelte/pull/6031).
+- Script/style preprocessors may want to remove attributes, which is currently not possible unless they become a markup preprocessor and do it themselves somehow (https://github.com/sveltejs/svelte-preprocess/issues/260, https://github.com/sveltejs/svelte/issues/5900).
+- In general, the distinction between markup/script/style forces a decision on preprocessor authors that may lock them into a suboptimal solution.
+
+A better preprocessing API should therefore
+
+- be easier to grasp and reason about
+- execute preprocessors predictably
+- provide more flexibility
+
+## Detailed design
+
+The preprocessor API is no longer split up into three parts. Instead of expecting an object with `{script, style, markup}` functions, it expects an object with these properties:
+
+```typescript
+{
+  name: 'a string', // sometimes it's good to know from a preprocessor library perspective what preprocessor you are dealing with
+  preprocess: function,
+  // possibly more options later on
+}
+```
+
+The `preprocess` function, which is handed the complete source code, looks like this:
+
+```typescript
+result: {
+  code: string,
+  dependencies: Array<string>
+} = await svelte.preprocess(
+  (input: { code: string, filename: string }) => Promise<{
+    code: string,
+    dependencies?: Array<string>,
+    map?: any
+  }>
+)
+```
+
+Additionally, `svelte/preprocess` exports new utility functions that make it possible to reproduce the current behavior:
+
+### extractStyles
+
+```typescript
+function extractStyles(code: string): Array<{
+  start: number;
+  end: number;
+  location: "top" | "nested" | "unknown";
+  content: { text: string; start: number; end: number };
+  attributes: Array<{
+    name: string;
+    value: string;
+    start: number;
+    end: number;
+  }>;
+}>;
+```
+
+Extracts the style tags from the source code, each with its start/end position, content, and attributes. Determining `location` would require a parser that has to make some assumptions about the code being "standard JS/HTML-syntax compliant" (in the sense that opening and closing brackets match, etc.). There is a pending PR for such a parser: https://github.com/sveltejs/svelte/pull/6611. We could make that parser the default and fall back to the old regex approach in case of an error, in which case `location` would be `unknown` rather than `top` or `nested`. Preprocessors could use this to transpile only top-level scripts.
+
+### extractScripts
+
+Same as `extractStyles`, but for scripts.
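+
+To make the regex fallback mentioned above more concrete, here is a minimal sketch of how both functions could be built on one shared helper. The helper `extractTags`, the interface names, and the attribute regex are assumptions made for illustration, not part of this RFC or of Svelte's existing code. The sketch always reports `location` as `unknown`, ignores tags inside comments or strings, and reports a valueless attribute with an empty string as its value.
+
+```typescript
+// Hypothetical sketch of a regex-based fallback; names other than
+// extractScripts/extractStyles are made up for illustration.
+interface ExtractedAttribute { name: string; value: string; start: number; end: number }
+interface ExtractedTag {
+  start: number;
+  end: number;
+  location: "top" | "nested" | "unknown";
+  content: { text: string; start: number; end: number };
+  attributes: ExtractedAttribute[];
+}
+
+function extractTags(code: string, tag: "script" | "style"): ExtractedTag[] {
+  // Opening tag with optional attributes, lazily matched content, closing tag
+  const tagRe = new RegExp(`<${tag}(\\s[^>]*)?>([\\s\\S]*?)</${tag}>`, "g");
+  const tags: ExtractedTag[] = [];
+  let match: RegExpExecArray | null;
+  while ((match = tagRe.exec(code)) !== null) {
+    const whole = match[0];
+    const rawAttributes = match[1] || "";
+    const text = match[2];
+    const start = match.index;
+    const attributesStart = start + 1 + tag.length; // right after `<script` / `<style`
+    const contentStart = attributesStart + rawAttributes.length + 1; // right after `>`
+
+    const attributes: ExtractedAttribute[] = [];
+    const attrRe = /([^\s=]+)(?:=(?:"([^"]*)"|'([^']*)'|(\S+)))?/g;
+    let attr: RegExpExecArray | null;
+    while ((attr = attrRe.exec(rawAttributes)) !== null) {
+      const attrStart = attributesStart + attr.index;
+      attributes.push({
+        name: attr[1],
+        value: attr[2] || attr[3] || attr[4] || "",
+        start: attrStart,
+        end: attrStart + attr[0].length,
+      });
+    }
+
+    tags.push({
+      start,
+      end: start + whole.length,
+      location: "unknown", // a regex cannot tell top-level from nested tags
+      content: { text, start: contentStart, end: contentStart + text.length },
+      attributes,
+    });
+  }
+  return tags;
+}
+
+const extractScripts = (code: string) => extractTags(code, "script");
+const extractStyles = (code: string) => extractTags(code, "style");
+```
+
+All positions are offsets into the original source string, so they can be passed on unchanged as `start`/`end` of a replacement.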
+
+### replaceInCode
+
+```typescript
+function replaceInCode(
+  code: string,
+  replacements: Array<{ code: string; start: number; end: number; map?: SourceMap }>
+): { code: string; map: SourceMap };
+```
+
+Performs replacements at the specified positions. If a map is given, that map is adjusted to map the whole content, not just the part that was processed. The result is the replaced code along with a merged map.
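+
+As an illustration of the string-splicing half of this function, here is a minimal sketch under the assumption that replacements must not overlap. The name `replaceInCodeSketch` is hypothetical, and source map adjustment and merging are deliberately left out, so this only returns `code`.
+
+```typescript
+// Hypothetical sketch: string splicing only, no source map handling.
+interface Replacement { code: string; start: number; end: number }
+
+function replaceInCodeSketch(code: string, replacements: Replacement[]): { code: string } {
+  // Work on a sorted copy so callers may pass replacements in any order
+  const sorted = replacements.slice().sort((a, b) => a.start - b.start);
+  let result = "";
+  let cursor = 0; // end of the last replaced range in the original code
+  for (const replacement of sorted) {
+    if (replacement.start < cursor || replacement.end < replacement.start) {
+      throw new Error(`Overlapping or invalid replacement at ${replacement.start}-${replacement.end}`);
+    }
+    // Keep the untouched code before this range, then emit the new code
+    result += code.slice(cursor, replacement.start) + replacement.code;
+    cursor = replacement.end;
+  }
+  // Append whatever follows the last replaced range
+  return { code: result + code.slice(cursor) };
+}
+```
+
+Because all positions refer to the original code, every replacement can be computed independently before any of them is applied.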
+
+These three functions would make it possible to reimplement a script preprocessor like this:
+
+```javascript
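+// `extractScripts` and `replaceInCode` are the utilities proposed above, not an existing API
+import { extractScripts, replaceInCode } from 'svelte/preprocess';
+
+function transformStuff(...) { /* user provided function */ }
+function getDependencies(...) { /* user provided function */ }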
+function script({code}) {
+    const scripts = extractScripts(code);
+    const replacements = scripts.map(transformStuff);
+    return {
+        ...replaceInCode(code, replacements),
+        dependencies: getDependencies(replacements)
+    }
+}
+```
+
+Using these three functions, we could also construct convenience functions like `replaceInScript`, which would let preprocessor authors write `return replaceInScript(code, transformStuff)`. Exactly which functions to provide is up for discussion; the point is that we should provide primitives to ensure more flexibility for advanced use cases.
+
+Since preprocessors are now "just" functions, there's no ordering headache anymore: preprocessors are invoked in the order they are given, which gives full control over composition.
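+
+A minimal sketch of what "invoked in order" could look like: each preprocessor receives the code produced by the previous one, and dependencies are collected along the way. The name `runInOrder` and the type names are assumptions made for illustration, and source map merging is omitted.
+
+```typescript
+// Hypothetical sketch of sequential invocation; not the actual implementation.
+interface PreprocessorInput { code: string; filename: string }
+interface PreprocessorOutput { code: string; dependencies?: string[]; map?: unknown }
+type PreprocessFn = (input: PreprocessorInput) => Promise<PreprocessorOutput>;
+
+async function runInOrder(
+  preprocessors: PreprocessFn[],
+  input: PreprocessorInput
+): Promise<{ code: string; dependencies: string[] }> {
+  let code = input.code;
+  const dependencies: string[] = [];
+  for (const preprocess of preprocessors) {
+    // Each preprocessor sees the output of the one before it
+    const output = await preprocess({ code, filename: input.filename });
+    code = output.code;
+    dependencies.push(...(output.dependencies || []));
+  }
+  return { code, dependencies };
+}
+```
+
+With this shape, changing the position of a preprocessor in the array is the only thing needed to change when it runs.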
+
+### Roadmap, backwards-compatibility
+
+This new functionality could be implemented in Svelte 3, where the `preprocess` function checks whether the passed-in preprocessor is an object (current API) or a function (proposed API). In Svelte 4, this would become the default, and we could provide a wrapper function for preprocessors that don't support the new API yet.
+
+```javascript
+export function legacyPreprocessor(preprocessor) {
+  return async ({ code, filename }) => {
+    // Legacy preprocessors expect the source as `content`, not `code`
+    const processedBody =
+      (await preprocessor.markup?.({ content: code, filename })) ?? { code };
+
+    // .. etc
+    return mergeParts(processedBody, processedScript, processedStyle);
+  };
+}
+```
+
+## How we teach this
+
+- Adjust docs
+- Write a blog post outlining the need for the change, what new capabilities it unlocks, and a migration path
+
+## Drawbacks
+
+None that I can think of right now
+
+## Alternatives
+
+None that I can think of right now
+
+## Unresolved questions
+
+- What about preprocessors inside mustache tags? Should the Svelte parser be adjusted for an opt-in parsing mode where a JavaScript-like syntax for mustache tags is assumed, so that their contents can be extracted and provided as another utility function for preprocessing? (https://github.com/sveltejs/svelte/issues/4701) - _Update_: The pending parser PR would partly address this by returning location info for mustache tags as well: https://github.com/sveltejs/svelte/pull/6611
+- Which other preprocessor utilities would be useful? For example, one that does the opposite of `extractScripts`/`extractStyles`, such as a `replaceMarkup` function (https://github.com/sveltejs/svelte/issues/5005).