Call remote tasks of service workspace #75
Reference: kscience/dataforge-core#75
No description provided.
Remote tasks can be called now
Remove debug print
Use ConnectionPool
Some comments, not a final review (did not have time to read the second part).
Coroutines are already in dataforge-context.
This should not be in the basic workspace module. It should be a separate module.
Can we avoid inheriting both client and server plugins for each workspace? Context configuration is more or less serializable, so it is better to share only the configuration, not duplicate the plugins.
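A rough sketch of the "share only configuration" idea: ship a serializable description of the context and let each side instantiate its own plugins from it. The `ContextSpec` shape below is hypothetical, not part of dataforge (whose real context configuration is a Meta tree):

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Hypothetical: a serializable description of a context, sent to the
// worker so each side builds its own plugin instances from it instead
// of duplicating plugin objects across processes.
@Serializable
data class ContextSpec(
    val name: String,
    val plugins: List<String>,           // plugin factory tags to load
    val properties: Map<String, String>, // flattened configuration values
)

fun encodeSpec(spec: ContextSpec): String =
    Json.encodeToString(ContextSpec.serializer(), spec)

fun decodeSpec(json: String): ContextSpec =
    Json.decodeFromString(ContextSpec.serializer(), json)
```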
There is already a WorkspacePlugin that does a similar thing.
Is it possible to avoid the cast? For example, specialize the output for this workspace?
Better to pass the Address from outside instead of initializing it in the constructor.
Why?
There is an existing one.
KType is not serializable. Maybe it is possible to partially serialize the remote plugin configuration, but I see no way to avoid a little bit of configuration:
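For illustration, the non-serializable KType can be sidestepped by carrying only a symbolic type reference; `TaskDescriptor` below is a hypothetical sketch, not code from this PR:

```kotlin
import kotlinx.serialization.Serializable

// KType cannot be a field of a @Serializable class, so a remote task
// description can only carry a symbolic reference to the result type.
@Serializable
data class TaskDescriptor(
    val taskName: String,
    val resultType: String, // e.g. a fully-qualified class name instead of KType
)
```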
WorkspacePlugin is also a TaskContainer, which I did not want. ClientPlugin is just a port to an existing remote plugin and has no tasks locally.

I think it may not be a good idea to expose the lambdarpc dependency.
Just a temporary hole.
Yes, but it is private for some reason
All contributions should target dev, so please resolve conflicts.

Major cleanup is required.
You can switch to the 0.6 branch. It fixes the problems with needless suspend factory functions and introduces context receivers.
I still do not understand why we need wrappers for plugins. Plugins are only a way to load tasks into a workspace. There should be a wrapper for a task, not a plugin.
The DataTree builder will be non-suspending in the next version, but it should already be possible to create one without suspending.
Need to think about better naming
Why not suspended?
Is it possible to replace the manual Coders with KSerializer or IOFormat? Or make converters?
There is an existing class for that.
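As an illustration of the question above, a coder backed by a KSerializer could look roughly like this; the `Coder` interface here is a simplified stand-in, not lambdarpc's actual API:

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.json.Json

// Simplified stand-in for a lambdarpc-style coder; the real interface differs.
interface Coder<T> {
    fun encode(value: T): String
    fun decode(encoded: String): T
}

// One generic adapter replaces every hand-written coder.
fun <T> coderOf(serializer: KSerializer<T>, json: Json = Json): Coder<T> =
    object : Coder<T> {
        override fun encode(value: T): String = json.encodeToString(serializer, value)
        override fun decode(encoded: String): T = json.decodeFromString(serializer, encoded)
    }
```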
@@ -0,0 +13,4 @@
* [Data] representation that is trivially serializable.
*/
@Serializable
internal data class DataPrototype(
This is unusable. I believe it should be something like this:
Again, you do not need a separate structure. All you need is a generic DataSet with a serializer.
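A minimal sketch of that suggestion, treating the data set as a map of named items (dataforge's real DataSet API differs; `encodeItems`/`decodeItems` are hypothetical helpers):

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.builtins.MapSerializer
import kotlinx.serialization.builtins.serializer
import kotlinx.serialization.json.Json

// Instead of a dedicated DataSetPrototype, serialize the named items
// directly with a caller-supplied serializer for the element type.
fun <T : Any> encodeItems(items: Map<String, T>, elementSerializer: KSerializer<T>): String =
    Json.encodeToString(MapSerializer(String.serializer(), elementSerializer), items)

fun <T : Any> decodeItems(json: String, elementSerializer: KSerializer<T>): Map<String, T> =
    Json.decodeFromString(MapSerializer(String.serializer(), elementSerializer), json)
```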
Switch to runTest.
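For reference, runTest from kotlinx-coroutines-test replaces runBlocking in tests; a minimal shape, with the workspace call left as a hypothetical placeholder:

```kotlin
import kotlinx.coroutines.test.runTest
import kotlin.test.Test

class ServiceWorkspaceTest {
    @Test
    fun executeRemoteTask() = runTest {
        // runTest skips delays and surfaces uncaught coroutine failures,
        // unlike runBlocking.
        // val result = workspace.produce("task") // hypothetical call
    }
}
```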
I thought about WorkspaceNode or WorkerWorkspace. There is also DistributedWorkspace, but it is not truly distributed itself. Also, ServiceWorkspace is good enough in my opinion: "Service" here means that this workspace should be run on some endpoint to be available.

Because it is blocking in the Google gRPC implementation. It can be made suspend and block a separate thread, but I do not see any reason to do so.
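If a suspend API were still wanted, the blocking stub call could simply be shifted to an IO thread; `block` below stands in for any blocking gRPC call, and the helper name is illustrative:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

// Wraps a blocking call (e.g. a Google gRPC blocking stub method) so the
// caller can suspend instead of blocking its own thread.
suspend fun <R> runBlockingCall(block: () -> R): R =
    withContext(Dispatchers.IO) { block() }
```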
Yes, it already exists, I just forgot about it.
This function is not available in dataforge; shall I add some dependency to the project?
When ServiceWorkspace.execute returns the result to the client, it does not yet know the serializer for the T. Then RemoteTask uses the serializer to deserialize the DataSet from the prototype.

But where should the endpoints for the tasks be provided? It seems that the task declaration is not a good enough place. Should I use Meta for this?
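A sketch of the flow described above, with all names hypothetical: the server ships an opaque, string-encoded prototype, and only the client-side task, which knows the KSerializer<T>, restores the typed values:

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.json.Json

// Hypothetical: the client-side task holds the serializer the server
// did not have when it produced the prototype.
class RemoteTaskClient<T : Any>(private val resultSerializer: KSerializer<T>) {
    fun decodePrototype(prototype: Map<String, String>): Map<String, T> =
        prototype.mapValues { (_, encoded) ->
            Json.decodeFromString(resultSerializer, encoded)
        }
}
```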
@winter-yuki is it ready for review?
No, I need a bit of time. Maybe it will be ready tomorrow; I'll re-request review.
Maybe I should also use Square Wire to remove the lambdarpc dependency. But I failed to add its plugin to the project: Gradle cannot find it, and it seems that it does not know about any plugin repositories at all. Maybe it is an MPP plugin issue.

@@ -0,0 +27,4 @@
private val dataSerializer: KSerializer<Any>? = null,
data: DataSet<*> = DataTree<Any>(),
targets: Map<String, Meta> = mapOf(),
) : Workspace by RemoteTaskWorkspace(context, data, targets), ServiceWorkspace {
Why do we need such a complex construct? Isn't it possible to add a simple wrapper by adding a connection and a serializer resolver? Something like this for the server:
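(A possible shape of such a server-side wrapper, every name hypothetical:)

```kotlin
import java.io.Closeable
import kotlinx.serialization.KSerializer

interface Workspace // stand-in for space.kscience.dataforge.workspace.Workspace

// Server side: the workspace stays a plain Workspace; the wrapper owns
// only the endpoint lifecycle and a serializer resolver for task results.
class WorkspaceServer(
    val workspace: Workspace,
    val port: Int,
    val serializerFor: (taskName: String) -> KSerializer<*>,
) : Closeable {
    fun start() { /* bind the endpoint and expose the workspace tasks */ }
    override fun close() { /* shut the endpoint down */ }
}
```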
And for the client:
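(A matching client-side sketch, reusing the hypothetical Workspace stand-in and resolver from the server sketch above:)

```kotlin
import java.io.Closeable
import kotlinx.serialization.KSerializer

// Client side: a Workspace implementation that forwards task calls to a
// remote endpoint and decodes results via the same resolver.
class WorkspaceClient(
    val host: String,
    val port: Int,
    val serializerFor: (taskName: String) -> KSerializer<*>,
) : Workspace, Closeable {
    override fun close() { /* release the underlying connection */ }
}
```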
We can add a close method to a generic Workspace to handle the lifecycle.

@@ -0,0 +12,4 @@
/**
* [Workspace] that can expose its tasks to other workspaces as a service.
*/
public interface ServiceWorkspace : Workspace, Closeable {
I do not see why we need it in the API; on the server side the connection should be closeable, not the workspace. Maybe you should rename it to a connection and remove the inheritance from Workspace?
@@ -0,0 +13,4 @@
* [Data] representation that is trivially serializable.
*/
@Serializable
internal data class DataPrototype(
Is it used somewhere?
@@ -0,0 +19,4 @@
* [DataSet] representation that is trivially serializable.
*/
@Serializable
internal data class DataSetPrototype(val meta: Meta, val data: Map<String, DataPrototype>) {
Do we need that?
@@ -0,0 +26,4 @@
@BeforeAll
fun before() = runBlocking {
worker1 = NodeWorkspace(
context = Global.buildContext("worker1".asName()) {
This is outdated; use the Context() factory function instead.

Pull request closed