WIP: New drag and drop API#4571
Conversation
This changes the data transfer system to better match cross-platform semantics
|
I haven't read very much into impl. details, but from what I've seen:
In the current API, we have this URI list preloaded, but I think this is solved, we should only load with async-ish API once user asks, I think, if you want to follow Wayland. If the API is not lazy, I'm not sure how it should be done, so if yo can provide tl;dr it would help (haven't found clear wording in your research).
Well, because it's a lot of work, you need to negotiate what you want to read (image/text/html/whatever), then both ends should do non-blocking writing/reading to an FD, thus implying that you plug that FD into epoll based event loop or something. And ensure that you don't hang each other. And compositor does very little here actually, it just lets you exchange FDs with other end, but all this negotiation + writing the right mime type is on the client. So for example, when something drags object/pastes, you have a bunch of mime types you can use to query data, e.g. image/text/audio, then you ask for audio, and other end should provide audio. Then you want to initiate drag and drop, and you started dragging something, but this something is either text or image, and then depending on what the other end picks(e.g. you drag from winit to firefox, and firefox picks one of two mimetypes you gave to it), winit must reply with the right data. Note that you don't create buffer for both image and text before hand, you only create them once you get event what type of data other end wants. It can also ask you for both, or ask you something from time to time as long as you have advertised something. |
|
@kchibisov Thanks for your response. I appreciate you clarifying the way that data transfer (i.e. clipboard and drag-and-drop) works, but it might be worth reading through the PR since I have an extensive doc comment explaining the exact concerns that you bring up and how the API addresses them. In particular:
The implementation in this PR only addresses drag-and-drop, but it is specifically designed to support clipboard operations in the future. Clipboard operations are planned in a follow-up PR, and only left unimplemented for now for two reasons:
The type hierarchy is like so:
The API flow as-implemented by this PR is like so:
Note that we cannot just return a potentially-blocking Regarding moving |
But we can pass Also, X11 is nearly dead, so no point in design around it. |
|
Yeah, as of the latest few commits I’ve moved the whole drag-and-drop API to the event loop. Fetching still needs to be done asynchronously to support X11 though unfortunately, there’s not really a way around that without running the risk of deadlocking the event loop. The user can always immediately try to read the data without waiting for the I’ve also updated the dnd example to show how to use the new API. |
kchibisov
left a comment
There was a problem hiding this comment.
Generally using event will be fine, I guess. The API should certainly be async, sync won't work on Wayland as well.
|
|
||
| self.last_dnd_fetch = None; | ||
| }, | ||
| WindowEvent::DataTransferResult { serial, .. } => { |
There was a problem hiding this comment.
But do we really need it as Event or a series of callbacks on an ApplicationTrait? That way you can make this async API safe to use and also don't have to clone data around etc in some 'potential' cases.
There was a problem hiding this comment.
Like for example, on Wayland, you'll get transfer id + bunch of mime types, then user e.g. requests Utf8, and then event loop will slowly read it and call back to user with the type it requested or signal error if reading data failed (other end died during transaction), or maybe just callback with e.g. Vec<8> and a hint + transfer user used, so we move all convertions to user of the API.
That way we don't need to enforce UTF-8, I'd prefer winit to be non-opinionated here and just pass data around as it goes.
There was a problem hiding this comment.
So w.r.t. the UTF-8 side of things: that is already how it's implemented in winit on master. The cross-platform interface converts everything to UTF-8, since Rust strings are UTF-8. With the design in this PR, the user can downcast to the platform-specific types if they want to access something that isn't UTF-8 strings. That being said, I'm not happy with how the UTF-16 detection works inside the X11 implementation, that's still a work in progress.
It's not really possible to just unconditionally return a vector of bytes, or at least it's not very helpful to do so. On Wayland and X11, all clipboard/dnd data is ultimately just a binary blob, but that's simply not true for Windows or macOS. In particular, file URIs on both Windows and macOS are an array of individual items, each of which representing a single path. Even if all types could be exposed as a binary blob cross-platform, the encoding of strings differs between platforms. The user can't even just use OsStr or something like that since on X11, it's not guaranteed that strings will be UTF-8. Firefox transfers HTML as UTF-16, for example. Images can be exposed as raw bytes in a known format on every platform, and I believe that the same is true for audio although I'm not totally sure if that's true on macOS. Of the types that are supported cross-platform (the ones in TypeHint), AFAIK the only ones requiring special handling for cross-platform support are strings and file URIs, which is why they have special-cased helper methods in TypedData.
I've been talking with my colleagues, and I believe that it's not necessary to have this DataTransferResult event. It's only necessary on X11 to prevent deadlocking the event thread, but the way that Qt handles it is simply to drive the event loop inside their equivalent of fetch_data_transfer. It feels kinda gross and I wish it wasn't necessary, but the X11 spec mandates that the XConvertSelection message is handled synchronously, followed by sending SelectionNotify synchronously, so we can peek the upcoming messages inside fetch_data_transfer and if the source application sends some other non-SelectionNotify message instead then we can just return an error and return to the regular event loop.
For every other platform, the event is unnecessary. While all other platforms have some way of sending data asynchronously, none of them require actually driving the event loop to do so, so the application can read the TypedData directly in the event loop without deadlocking. It'll still potentially stall the event loop, but that's fine. If they don't want to stall, they can transfer the TypedData to another thread and handle it there.
There was a problem hiding this comment.
But Wayland mandates it? You need to async read the pipe which you negotiate in async manner as well and the thing can even die inside the negotiation. You also need async when you send data to yourself, because you'll overflow channel and need to release reader/writer (hence non blocking IO is mandatory and driving IO read/write by event loop is also mandatory).
Like I'm not sure how you'd do anything sync on Wayland.
There was a problem hiding this comment.
I mean, if you block on sending data to yourself then the application or UI framework developer has complete control over all the code in question and I’d argue it’s not winit's job to prevent that bug. The DataTransferResult event doesn’t even prevent that bug on Wayland since it only advertises that the data is ready to start reading, not that all data has been received. We should probably mention the chance of deadlocking in the docs, though.
Wayland has an async interface (as do all platforms, including macOS and Windows) but, unlike X11, it doesn’t require driving the event loop to receive data. You send a file descriptor to the source application and the source application writes to it. The fact that X11 requires driving the event loop is the only reason that the event is necessary, so I think that it should be up to the user whether the data transfer is read synchronously or asynchronously.
There was a problem hiding this comment.
But why use sync interface in the first place if it's error prone when we can do async which is safer to use 🤔 I guess so the users don't wait for the transaction to complete?
There was a problem hiding this comment.
The event that this thread is in reference to doesn't solve the issues mentioned above. The only issue it solves is specifically needing to drive the event loop between requesting a type from a data transfer and being able to access it, which is only necessary on X11 and has a workaround (that Qt uses, so clearly it's at least a functional solution). I'm completely in favour of having async IO methods on TypedData, e.g. try_async_read(&self) -> Option<Box<dyn AsyncRead>>, which would solve the issues you mention, but I think those methods should coexist with sync equivalents because of function colouring.
There was a problem hiding this comment.
So as of 98106b8, I've changed X11's weird behaviour to just result in runtime errors instead of leaking them into the API. Now the API is non-blocking, meaning error-returning methods can return io::ErrorKind::WouldBlock instead of waiting. If the user explicitly wants to block, they can call wait_for_data, which will return io::ErrorKind::Deadlock if waiting would deadlock on that platform (i.e. on X11).
This means that, so long as the user correctly handles errors, they have a path to support all backends, without needing to add new events to the API that only make sense on X11. This design has all the benefits of the previous event-driven design, but avoids leaking X11's weirdness into the API.
An async interface is still desirable, but since the rest of winit is still sync I think that's best left for a follow-up PR (or for an external crate, since the per-platform APIs are still accessible with this design). I think writing a Future/Stream/AsyncRead-based API for each platform is out of scope for this PR, custom Future implementations are extremely hard to get correct. As far as I can tell, the only platform that could easily return an AsyncRead for clipboard/drag-and-drop data is Wayland, since it just uses file handles. On every other platform it would have to be custom-implemented.
There was a problem hiding this comment.
A colleague pointed out that it's probably better to make wait_for_data an internal implementation detail of the X11 SelectionReader and remove it from the cross-platform trait. The user doesn't really have a reason to handle the WouldBlock (from the reader) and Deadlock (from wait_for_data) error cases differently and it's only a major problem on X11. On Wayland we choose how the file descriptor is implemented, so if there's a risk of deadlocks that's something that winit can solve internally.
This commit also adds deadlock detection to X11, since on that platform the event loop needs to be polled in order for the selection to be received.
ece1dfa to
98106b8
Compare
650e7ee to
2e99f38
Compare
Since we don't need internal mutability on `Dnd` any more.
For consistency with other platforms
This PR implements a new API for drag and drop, with a
DataTransfertype which abstracts over the various clipboard/drag and drop APIs across different platforms. I built this on top of #2429 in order to ensure @SludgePhD gets credit if it gets merged, although admittedly I ended up removing pretty much all of their work while I was reworking the design.This is being built in order to help support drag-and-drop work in Slint's winit backend. As part of that work, I did extensive research on how drag-and-drop and clipboard APIs are implemented across different platforms, and wrote a (still WIP) research document that can be found here.
Some platforms (Wayland, X11) always transfer bytes with a MIME type, other platforms have a set of standardised transferrable types. However, all types that are supported cross-platform (images, RTF, HTML, plaintext, URIs/URI lists) are cleanly expressible using MIME types on all supported platforms. As part of this PR, I've written up a quick-and-dirty summary of which types are supported on different platforms.
Design
The new API is inspired by the browser's
DataTransferAPI. The main complexity comes from supporting both the common set of capabilities (the types and traits inwinit-core/src/data_transfer.rs) while also allowing a consumer to use the platform-specific APIs.The design may look somewhat complex, and I am open to suggestions for simplifying it, but OS drag-and-drop/clipboard/etc APIs are just fundamentally complex. Unfortunately, there's going to be a lot of complexity here no matter the implementation. This article by a Wayland maintainer describes it as "arguably one of the most complicated parts of the core Wayland protocol", and from researching the design for other platforms it seems like Wayland actually has the simplest API.
Current state
X11 and macOS are working, with the supported types in the example (string, HTML, file URIs) tested on my two machines:
Windows and Wayland are yet to be completed, but I have machines running both and so can test them.