# Export everything from Space
This repository helps you export your data from JetBrains Space using the Space SDK.
## Setting up Space Application
To access data in Space, one needs to [create a Space Application](https://www.jetbrains.com/help/space/applications.html) and grant it appropriate permissions. I am not sure which permissions cover access to images, but here are the ones I allowed:
* Provide external attachment unfurls
* Provide external inline unfurls
* View project data
* View book metadata
* View content
For restricted projects, the project and its permissions must be added to the application manually.
Then copy the `clientId` and `clientSecret` of the application and pass them as command-line parameters.
## Export documents
Initially, the main idea was to export Space documents. Those documents are written in Markdown and can include images and file references, but there is no dedicated API to download them. Exporting a document therefore takes several steps:
* Download the page as Markdown to a directory.
* Download attached images to a specific directory.
* Replace references to attachments in the Markdown files.
### Downloading texts
Text and binary documents are processed recursively, starting at the given `folderId`, or at the project root if no `folderId` is given.
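
The recursion can be pictured with a small Python sketch. This is not the tool's code: the `Folder` type and the in-memory tree are hypothetical stand-ins for whatever the Space SDK returns, and only the traversal pattern is the point.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Folder:
    """Hypothetical stand-in for a Space document folder."""
    name: str
    documents: List[str] = field(default_factory=list)
    subfolders: List["Folder"] = field(default_factory=list)


def walk(folder: Folder, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (relative directory, document name) for every document below `folder`."""
    path = f"{prefix}/{folder.name}" if prefix else folder.name
    # Documents of the current folder come first, then each subfolder in turn.
    for doc in folder.documents:
        yield path, doc
    for sub in folder.subfolders:
        yield from walk(sub, path)


# Illustrative tree: a project root with one nested folder.
root = Folder("project", ["readme.md"], [Folder("design", ["spec.md", "logo.png"])])
for directory, doc in walk(root):
    print(f"{directory}/{doc}")
```

Starting from a subfolder instead of the root corresponds to passing a `folderId`.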
### Download images
Images in Space documents are inserted in the following format: `![](/d/aaaabbbbcccc?f=0 "name.png")`. The aim is to detect those links in the files and download the corresponding images. The links cannot be used directly, because access requires OAuth authentication, so the download uses an access token from the Space SDK.
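
A minimal Python sketch of the detection step follows. It is an illustration, not the tool's implementation: the regular expression is inferred only from the single example above, the host name is a placeholder, and sending the token as a `Bearer` header is an assumption about how the OAuth token is used.

```python
import re
import urllib.request
from typing import List, NamedTuple

# Matches links of the form ![](/d/<id>?f=<n> "<file name>").
# The pattern is an assumption derived from the one example in this README.
IMAGE_LINK = re.compile(
    r'!\[[^\]]*\]\((?P<path>/d/[A-Za-z0-9]+\?f=\d+)\s+"(?P<name>[^"]*)"\)'
)


class ImageLink(NamedTuple):
    path: str       # server-relative path, e.g. /d/aaaabbbbcccc?f=0
    file_name: str  # title of the link, used as the local file name


def find_image_links(markdown: str) -> List[ImageLink]:
    """Return every Space attachment link found in the Markdown text."""
    return [ImageLink(m["path"], m["name"]) for m in IMAGE_LINK.finditer(markdown)]


def authorized_request(base_url: str, link: ImageLink, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries the token as a Bearer header."""
    url = base_url.rstrip("/") + link.path
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


links = find_image_links('Intro ![](/d/aaaabbbbcccc?f=0 "name.png") outro')
# "https://example.jetbrains.space" and the token value are placeholders.
request = authorized_request("https://example.jetbrains.space", links[0], "<access token>")
print(request.full_url)
```

The request object can then be passed to `urllib.request.urlopen` to fetch the image bytes.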
### Replace references
After a file is successfully downloaded, the reference to it in the document is replaced with a local one.
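
As a rough Python sketch of this step (not the tool's code), the function below rewrites only the links whose files were actually downloaded and leaves the rest untouched. The `images` directory name and the link pattern are assumptions made for the example.

```python
import re
from typing import Set

# Same link shape as in the README example: ![](/d/<id>?f=<n> "<file name>").
IMAGE_LINK = re.compile(
    r'(?P<prefix>!\[[^\]]*\]\()/d/[A-Za-z0-9]+\?f=\d+\s+"(?P<name>[^"]*)"\)'
)


def localize_links(markdown: str, downloaded: Set[str], image_dir: str = "images") -> str:
    """Point links at local files, but only for files listed in `downloaded`."""
    def to_local(match: "re.Match[str]") -> str:
        name = match.group("name")
        if name not in downloaded:
            return match.group(0)  # download failed or was skipped: keep the original link
        return f'{match.group("prefix")}{image_dir}/{name} "{name}")'

    return IMAGE_LINK.sub(to_local, markdown)


source = '![](/d/aaaabbbbcccc?f=0 "name.png") and ![](/d/ddddeeee?f=0 "other.png")'
print(localize_links(source, downloaded={"name.png"}))
```

Keeping failed links unchanged makes it easy to see afterwards which attachments are still missing.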
### Document conversion with Pandoc
The package can also convert documents automatically via Pandoc. See the CLI keys reference for details.
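
The tool's own conversion keys are not listed in this README, so the Python sketch below only shows what calling pandoc on an exported Markdown file typically looks like. The helper names are hypothetical, and using the target format as the file extension is a simplification that works for formats such as `html` or `docx`.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def pandoc_command(source: Path, target_format: str = "html") -> List[str]:
    """Build the pandoc command line for converting one Markdown file."""
    target = source.with_suffix(f".{target_format}")
    return [
        "pandoc", str(source),
        "--from", "markdown",
        "--to", target_format,
        "--output", str(target),
    ]


def convert(source: Path, target_format: str = "html") -> None:
    """Run pandoc, failing early with a clear message if it is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(pandoc_command(source, target_format), check=True)


print(" ".join(pandoc_command(Path("page.md"), "docx")))
```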
### CLI for document download
The CLI for document extraction is the following:
```
./space-export docs --clientId <Client ID> --clientSecret <Client Secret> <optional keys> <mandatory Space page URL>
```
The URL can be either a folder page or a project page. If it is a project page, all documents in the project are exported.
## Export repositories
This is straightforward: the tool scans projects for repositories and clones them using the system git and the default user SSH key (support for a custom SSH certificate may be added in the future).
The CLI is the same as for documents, but accepts only a project root as the URL.
## Export chat history
Chat history is exported the same way as documents (including threads).
The URL for chats is either a specific chat page (without threads for now) or the Space base URL (in which case all chats are exported).
## Export direct messages
Direct messages need different treatment because they require authorization on behalf of the user. To do that, create a personal token (search for `Personal token` in Space) with global `View direct messages`, `View messages` and `View profile` access. Then pass it with the `--token` key like this:
```
./space-export direct --token <Token string> <URL>
```
The URL can be either the Space base URL or the URL of a chat.