Call remote tasks of service workspace #75

Closed
winter-yuki wants to merge 25 commits from winter-yuki/distributed into 0.6
winter-yuki commented 2022-04-30 19:38:27 +03:00 (Migrated from github.com)

Remote tasks can be called now

Remote tasks can be called now
winter-yuki (Migrated from github.com) reviewed 2022-05-01 10:23:37 +03:00
winter-yuki (Migrated from github.com) reviewed 2022-05-01 10:24:33 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-01 10:24:32 +03:00

Use ConnectionPool

Use `ConnectionPool`
altavir (Migrated from github.com) reviewed 2022-05-04 20:35:06 +03:00
altavir (Migrated from github.com) left a comment

Some comments, not a final review (had not time to read the second part.

Some comments, not a final review (had not time to read the second part.
altavir (Migrated from github.com) commented 2022-05-04 17:53:03 +03:00

Coroutines are alredy in dataforge-context

Coroutines are alredy in dataforge-context
altavir (Migrated from github.com) commented 2022-05-04 17:53:25 +03:00

Should not be in basic workspace module. Should be a separate module.

Should not be in basic workspace module. Should be a separate module.
altavir (Migrated from github.com) commented 2022-05-04 17:56:58 +03:00

Can we avoid inheriting both client and server plugins for each workspace? Context configuration is more or less serializeable, so it is better to share only configuration, not duplicate plugins.

Can we avoid inheriting both client and server plugins for each workspace? Context configuration is more or less serializeable, so it is better to share only configuration, not duplicate plugins.
altavir (Migrated from github.com) commented 2022-05-04 17:57:32 +03:00

There is already WorkspacePlugin that does similar thing.

There is already WorkspacePlugin that does similar thing.
altavir (Migrated from github.com) commented 2022-05-04 18:23:28 +03:00

Is it possible to avoid the cast? For example specialize output for this workspace?

Is it possible to avoid the cast? For example specialize output for this workspace?
altavir (Migrated from github.com) commented 2022-05-04 20:33:41 +03:00

Better to pass Adress from outside instead of constructor initialization.

Better to pass Adress from outside instead of constructor initialization.
altavir (Migrated from github.com) commented 2022-05-04 20:34:03 +03:00

Why?

Why?
altavir (Migrated from github.com) commented 2022-05-04 20:34:30 +03:00

There is existing one

There is existing one
winter-yuki (Migrated from github.com) reviewed 2022-05-09 19:26:55 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-09 19:26:55 +03:00

KType is not serializable. Maybe it is possible to partially serialize remote plugin configuration, but I see no way to avoid a little bit of configuration:

private class RemoteMyPlugin(endpoint: Endpoint) : ClientWorkspacePlugin(
    MyPlugin.tag,
    endpoint,
    "task".asName() to typeOf<Int>()
)
KType is not serializable. Maybe it is possible to partially serialize remote plugin configuration, but I see no way to avoid a little bit of configuration: ```kotlin private class RemoteMyPlugin(endpoint: Endpoint) : ClientWorkspacePlugin( MyPlugin.tag, endpoint, "task".asName() to typeOf<Int>() ) ```
winter-yuki (Migrated from github.com) reviewed 2022-05-09 19:29:38 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-09 19:29:38 +03:00

WorkspacePlugin is also a TaskContainer, which I did not want to. ClientPlugin is just a port to existing remote plugin and has no tasks locally.

`WorkspacePlugin` is also a `TaskContainer`, which I did not want to. `ClientPlugin` is just a port to existing remote plugin and has no tasks locally.
winter-yuki (Migrated from github.com) reviewed 2022-05-09 19:48:38 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-09 19:48:37 +03:00

I think that may be not a good idea to expose lambdarpc dependency

I think that may be not a good idea to expose lambdarpc dependency
winter-yuki (Migrated from github.com) reviewed 2022-05-09 19:48:53 +03:00
winter-yuki (Migrated from github.com) reviewed 2022-05-09 19:49:39 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-09 19:49:39 +03:00

Yes, but it is private for some reason

Yes, but it is private for some reason
altavir commented 2022-05-24 12:18:29 +03:00 (Migrated from github.com)

All contributions should target dev, so please resolve conflicts.

All contributions should target `dev`, so please resolve conflicts.
altavir (Migrated from github.com) requested changes 2022-05-24 12:43:51 +03:00
altavir (Migrated from github.com) left a comment

Major cleanup is required.

You can switch to 0.6 branch. It fixes problems with needless suspend factory functions and introduces context receivers.

Major cleanup is required. You can switch to 0.6 branch. It fixes problems with needless suspend factory functions and introduces context receivers.
altavir (Migrated from github.com) commented 2022-05-24 12:22:09 +03:00

I still do not understand why do we need wrappers for plaguns. Plagins are only a way to load tasks into workspace. Ther should be a wrapper for a task, not a plugin.

I still do not understand why do we need wrappers for plaguns. Plagins are only a way to load tasks into workspace. Ther should be a wrapper for a task, not a plugin.
altavir (Migrated from github.com) commented 2022-05-24 12:28:34 +03:00

DataTree builder will be non-suspending in the next version, but it should be possible to create it without suspending now.

DataTree builder will be non-suspending in the next version, but it should be possible to create it without suspending now.
altavir (Migrated from github.com) commented 2022-05-24 12:29:17 +03:00

Need to think about better naming

Need to think about better naming
altavir (Migrated from github.com) commented 2022-05-24 12:29:54 +03:00

Why not suspended?

Why not suspended?
altavir (Migrated from github.com) commented 2022-05-24 12:31:49 +03:00

Is it possible to replace manual 'Coder's with KSerializer or IOFormat? Or make converters?

Is it possible to replace manual 'Coder's with KSerializer or IOFormat? Or make converters?
altavir (Migrated from github.com) commented 2022-05-24 12:40:16 +03:00

There is existing class for that

There is existing class for that
@ -0,0 +13,4 @@
* [Data] representation that is trivially serializable.
*/
@Serializable
internal data class DataPrototype(
altavir (Migrated from github.com) commented 2022-05-24 12:39:56 +03:00

This is unusable. I believe that it should be something like this:

class SerializeableData<T>(val data: Data<T>, val serializer: KSerializer<T>): Data<T> by data
This is unusable. I believe that it should be something like this: ```kotlin class SerializeableData<T>(val data: Data<T>, val serializer: KSerializer<T>): Data<T> by data ```
altavir (Migrated from github.com) commented 2022-05-24 12:42:07 +03:00

Again, you do not need a separate structure. All you need is a generic DataSet with serializer.

Again, you do not need a separate structure. All you need is a generic DataSet with serializer.
winter-yuki (Migrated from github.com) reviewed 2022-05-25 18:01:55 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-25 18:01:55 +03:00

I thought about WorkspaceNode or WorkerWorkspace. There is also DistributedWorkspace but it is not truly distributed itself.
Also ServiceWorkpace is good enought to my opinion. "Service" here means that this workspace should be run on some endpoint to be available.

I thought about `WorkspaceNode` or `WorkerWorkspace`. There is also `DistributedWorkspace` but it is not truly distributed itself. Also `ServiceWorkpace` is good enought to my opinion. "Service" here means that this workspace should be run on some endpoint to be available.
winter-yuki (Migrated from github.com) reviewed 2022-05-25 18:04:13 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-25 18:04:12 +03:00

Because it is blocking in the google gRPC implementation. It can be made suspend and block separate thread but I do not see any reasons to do so.

Because it is blocking in the google gRPC implementation. It can be made suspend and block separate thread but I do not see any reasons to do so.
winter-yuki (Migrated from github.com) reviewed 2022-05-25 18:04:38 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-25 18:04:38 +03:00

Yes, it already exist, I just forgot about it.

Yes, it already exist, I just forgot about it.
winter-yuki (Migrated from github.com) reviewed 2022-05-25 20:49:44 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-25 20:49:44 +03:00

This function is not available in dataforge, shall I add some dependency to the project?

This function is not available in dataforge, shall I add some dependency to the project?
winter-yuki (Migrated from github.com) reviewed 2022-05-25 20:53:44 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-25 20:53:44 +03:00

When ServiceWorkspace.execute returns result to the client, it do not know yet about serializer for the T. Then RemoteTask uses serializer to deserialize DataSet from prototype.

When `ServiceWorkspace.execute` returns result to the client, it do not know yet about serializer for the `T`. Then `RemoteTask` uses serializer to deserialize DataSet from prototype.
winter-yuki (Migrated from github.com) reviewed 2022-05-25 21:02:22 +03:00
winter-yuki (Migrated from github.com) commented 2022-05-25 21:02:22 +03:00

But where should endpoints for the tasks be provided? Seems that task declaration is not good enough place. Should I use Meta for this?

But where should endpoints for the tasks be provided? Seems that task declaration is not good enough place. Should I use Meta for this?
altavir commented 2022-06-03 18:42:58 +03:00 (Migrated from github.com)

@winter-yuki is it ready for review?

@winter-yuki is it ready for review?
winter-yuki commented 2022-06-03 19:01:41 +03:00 (Migrated from github.com)

@winter-yuki is it ready for review?

No, I need a bit of time. Maybe tomorrow will be ready, I'll rerequest review

> @winter-yuki is it ready for review? No, I need a bit of time. Maybe tomorrow will be ready, I'll rerequest review
winter-yuki commented 2022-06-04 18:37:25 +03:00 (Migrated from github.com)

Maybe I should also do:

  • Move to square wire to remove lambdarpc dependency. But I failed to add it's plugin to the project, gradlle cannot find it and seems that it does not know any plugins repo at all. Meybe it is mpp plugin issue.
  • Implement simple result caching as a file directory on some machine.
  • Implement data-parallelism somehow because it is the main application for the distributed processing in my point of view.
Maybe I should also do: - Move to `square wire` to remove lambdarpc dependency. But I failed to add it's plugin to the project, gradlle cannot find it and seems that it does not know any plugins repo at all. Meybe it is mpp plugin issue. - Implement simple result caching as a file directory on some machine. - Implement data-parallelism somehow because it is the main application for the distributed processing in my point of view.
altavir (Migrated from github.com) reviewed 2022-06-22 18:59:26 +03:00
@ -0,0 +27,4 @@
private val dataSerializer: KSerializer<Any>? = null,
data: DataSet<*> = DataTree<Any>(),
targets: Map<String, Meta> = mapOf(),
) : Workspace by RemoteTaskWorkspace(context, data, targets), ServiceWorkspace {
altavir (Migrated from github.com) commented 2022-06-22 18:50:33 +03:00

Why do we need such complex construct. Isn't it possible to add a simple wrapper by adding a connection and serializator resolver. Something like this for server:.

/**
 * Expose existing workspace to external connections. Use Job to handle the lifecycle.
 * Serializers are resolved dynamically when appropriate task is called. If serializer is not found,
 * a "soft" error is produced.
 * /
fun Workspace.serve(serializatorResolver: (KType)->KSerializer<*>): Job

And for client:

fun Workspace.Companion.connect(ip: String, port: Int): Workspace

We can add a close method to a generic Workspace to handle the lifecycle.

Why do we need such complex construct. Isn't it possible to add a simple wrapper by adding a connection and serializator resolver. Something like this for server:. ```kotlin /** * Expose existing workspace to external connections. Use Job to handle the lifecycle. * Serializers are resolved dynamically when appropriate task is called. If serializer is not found, * a "soft" error is produced. * / fun Workspace.serve(serializatorResolver: (KType)->KSerializer<*>): Job ``` And for client: ```kotlin fun Workspace.Companion.connect(ip: String, port: Int): Workspace ``` We can add a `close` method to a generic Workspace to handle the lifecycle.
@ -0,0 +12,4 @@
/**
* [Workspace] that can expose its tasks to other workspaces as a service.
*/
public interface ServiceWorkspace : Workspace, Closeable {
altavir (Migrated from github.com) commented 2022-06-22 18:53:14 +03:00

I do not see, why we need it in the API on the server side the connection should be closeable, not the workspace. Maybe you should rename it to connection and remove inheritance from a workspace?

I do not see, why we need it in the API on the server side the connection should be closeable, not the workspace. Maybe you should rename it to connection and remove inheritance from a workspace?
@ -0,0 +13,4 @@
* [Data] representation that is trivially serializable.
*/
@Serializable
internal data class DataPrototype(
altavir (Migrated from github.com) commented 2022-06-22 18:53:56 +03:00

Is it used somewhere?

Is it used somewhere?
@ -0,0 +19,4 @@
* [DataSet] representation that is trivially serializable.
*/
@Serializable
internal data class DataSetPrototype(val meta: Meta, val data: Map<String, DataPrototype>) {
altavir (Migrated from github.com) commented 2022-06-22 18:55:42 +03:00

Do we need that?

Do we need that?
@ -0,0 +26,4 @@
@BeforeAll
fun before() = runBlocking {
worker1 = NodeWorkspace(
context = Global.buildContext("worker1".asName()) {
altavir (Migrated from github.com) commented 2022-06-22 18:56:43 +03:00

This is outdated, use Context() factory function instead.

This is outdated, use `Context()` factory function instead.
altavir closed this pull request 2023-11-26 10:00:56 +03:00

Pull request closed

Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: kscience/dataforge-core#75
No description provided.