The Smart Trick of the Mamba Paper That Nobody Is Discussing
We modified Mamba's inner equations so that they accept inputs from, and combine, two separate data streams. To the best of our knowledge, this is the first attempt to adapt the equations of SSMs to a vision task like style transfer without requiring any other module like cross-attention or custom normalizati