Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

animus 3 - host keyboard is significantly slower than slave #34

Closed
BlackCapCoder opened this issue Feb 27, 2019 · 10 comments
Closed

animus 3 - host keyboard is significantly slower than slave #34

BlackCapCoder opened this issue Feb 27, 2019 · 10 comments

Comments

@BlackCapCoder
Copy link
Contributor

BlackCapCoder commented Feb 27, 2019

The host half of the keyboard is significantly slower than the slave. The reason for this is that Wire.requestFrom blocks. The wire library does not provide a way to not block, but there are alternative libraries.

Async1MSDelay really takes between 5 to 7 milliseconds on the host, which means that we lose large chunks of time. This is almost certainly the reason why I only had debounce issues on the slave half of my keyboard; the host is so slow that it is virtually impossible to experience bouncing.

@BlackCapCoder
Copy link
Contributor Author

BlackCapCoder commented Feb 27, 2019

Replacing Wire.requestFrom(8, 32) with Wire.requestFrom(8, 4) greatly reduces the issue. I assume the 32 byte buffer is to reduce the chance of the slave dropping keystrokes, but is this really needed? We are doing this once for every iteration in the main loop

@BlackCapCoder
Copy link
Contributor Author

This is not a particularly nice solution, but I solved this on my keyboard by patching wire specifically for samd. The requestFrom function is implemented as:

uint8_t TwoWire::requestFrom(uint8_t address, size_t quantity, bool stopBit)
{
  if(quantity == 0)
  {
    return 0;
  }

  size_t byteRead = 0;

  rxBuffer.clear();

  if(sercom->startTransmissionWIRE(address, WIRE_READ_FLAG))
  {
    // Read first data
    rxBuffer.store_char(sercom->readDataWIRE());

    // Connected to slave
    for (byteRead = 1; byteRead < quantity; ++byteRead)
    {
      sercom->prepareAckBitWIRE();                          // Prepare Acknowledge
      sercom->prepareCommandBitsWire(WIRE_MASTER_ACT_READ); // Prepare the ACK command for the slave
      rxBuffer.store_char(sercom->readDataWIRE());          // Read data and send the ACK
    }
    sercom->prepareNackBitWIRE();                           // Prepare NACK to stop slave transmission
    //sercom->readDataWIRE();                               // Clear data register to send NACK

    if (stopBit)
    {
      sercom->prepareCommandBitsWire(WIRE_MASTER_ACT_STOP);   // Send Stop
    }
  }

  return byteRead;
}

Most of the work is delegated over to this sercom class, so I split the operation up into four parts:

bool TwoWire::myStartTransmisson (uint8_t address) {
  return sercom->startTransmissionWIRE(address, WIRE_READ_FLAG);
}

void TwoWire::myEndTransmisson (bool stopBit) {
  sercom->prepareNackBitWIRE();                           // Prepare NACK to stop slave transmission
  //sercom->readDataWIRE();                               // Clear data register to send NACK

  if (stopBit)
  {
    sercom->prepareCommandBitsWire(WIRE_MASTER_ACT_STOP);   // Send Stop
  }
}

void TwoWire::preparePop () {
  sercom->prepareAckBitWIRE();                          // Prepare Acknowledge
  sercom->prepareCommandBitsWire(WIRE_MASTER_ACT_READ); // Prepare the ACK command for the slave
}

unsigned char TwoWire::pop () {
  return sercom->readDataWIRE();
}

And back in ModI2CHost.c I can now request just one byte at a time:

#define nread 8
byte i2Buffer[nread];
int i2Stage = 0;
void CModI2CHost::Loop(void)
{

   ...

    if (i2Stage == 0) {
      Wire.myStartTransmisson(8);
      i2Stage++;
    } else if (i2Stage < nread) {
      i2Buffer[i2Stage-1] = Wire.pop();
      Wire.preparePop();
      i2Stage++;
    } else {
      i2Buffer[nread-1] = Wire.pop();
      Wire.myEndTransmisson (false);
      i2Stage = 0;
    }

    if (i2Stage != 0) return;

    byte keyData[8];
    byte keyType[8];
    byte keyMode[8];
    byte keyX[8];
    byte keyY[8];
    byte keyIndex = 0;

    for (int i = 0; i < nread; i+=4) {
      keyX[keyIndex]    = i2Buffer[i+0] & 0x0f; // bitwix] = temp[0] >> 4;
      keyData[keyIndex] = i2Buffer[i+1];
      keyType[keyIndex] = i2Buffer[i+2];
      keyMode[keyIndex] = i2Buffer[i+3];
      keyIndex++;
    }

The calls are going to block until data is available, this just lets me do an iteration in between. I could probably spread the calls out even more, but this was good enough for me; I can animate my LEDs without a perceivable difference in speed.

@blahlicus
Copy link
Owner

Yeah I'm aware the master is running slower than the slave. My own rainbow fade code experienced the same issue. I didn't have time to tackle it because I've been dealing with Arbites and the configuration tools for updating the firmware lately.

Instead of editing wire, would changing how I2CHost works be better? Here's some rough pseudo code:

Basically instead of periodically asking for 32 bytes, it asks for 2 bytes instead with the following structure:

  • byte0: transmission type
  • byte1: transmission size

It then calls another requestFrom with the size that the slave requested. This would allow the slave to control the buffer size of the master.

It also allows the slave to specify package types, so for example it could send keystrokes with transmission type 1, and send its EEPROM data with transmission type 2 so that it doesn't confuse the master.

I've placed the code within the 1ms delay loop just to be safe, but I don't think that's necessary. Placing it inside the 1ms delay loop also causes the right side to be 1ms behind the left side.

void loop()
{
  if (Animus.Async1MSDelay()) // place in 1MSDelay loop so that it runs less often
  {
    ... // set layer/brightness code goes here
    
    Wire.requestFrom(8,2); // request 2 byte package
    if (Wire.available()) // first byte as message type
    {
      /* transmission mode table
      mode 0 = idle, do nothing
      mode 1 = normal keystroke package
      mode 2 = print EEPROM
      */
      byte transmissionMode = Wire.read()
      
      
      if (Wire.available()) // second byte as message length
      {
        byte transmissionSize = Wire.read()
        
        if (transmissionMode == 0 )
        [
        //do nothing
        ]
        else if (transmissionMode == 1) // confirm data type as keystrokes
        {
          // code below is the I2C code from existing ModI2CHost
          if (transmissionSize > 0)
          {
            Wire.requestFrom(8, transmissionSize); // request actual data
            // Do stuff here to receive actual package for the keystrokes
          }
        }
        else if (transmissionMode == 2)
        {
          // etc
        }
      }
    }
  }
}

@BlackCapCoder
Copy link
Contributor Author

BlackCapCoder commented Feb 27, 2019

It would not be better than a patch, but it would indeed speed things up a lot. Another option would be to have the guest be the I2C host so that it can push key state changes itself. I am not sure how we would handle the update logic that we have on the host, and we would run into issues if we ever wanted to introduce a 3-part keyboard (have you seen the usb escape keys for mac?).

Also, placing the host logic inside async1MSDelay does not matter unless the code blocks for less than 1ms, as it would fire every single time.

It is also not necessary to call Wire.available, as Wire.requestFrom guarantees that all the bytes you requested will be available after the call, which is why it blocks.

@BlackCapCoder
Copy link
Contributor Author

If we were to implement your idea then we could squeeze both the transmission type and the package size into a single byte - 3 bits for transmission type and 5 for package size would give us 8 and 32 different values respectively. We don't need to ask for the size of the EEPROM dump package as we know how big that one is.

@blahlicus
Copy link
Owner

Would it be possible to implement your fix at the animus level and not mess with upstream libraries? Ultimately, your fix works by requesting 1 byte per loop so that I2C is not taking the majority of the time right?

I see that my solution would temporarily slow down the master when lots of keys are pressed on the slave side, what if we have requestFrom placed outside of the 1ms loop and have it running requesting 1 byte per loop?

Or we could place a "detector" requestFrom inside the 1MS loop asking for one byte, bit shift it to 3 and 5 bit uints for mode and size respectively, then activate the requestFrom outside of the 1MS loop when mode is a specific value.

Something like this:

byte transmissionSize = 0;
byte transmissionType = 0;
byte transmissionIndex = 0;
byte slaveDataArray[32];
void loop()
{
  if (Animus.Async1MSDelay()) // place in 1MSDelay loop so that it runs less often
  {
    ... // set layer/brightness code goes here
    
    if (messageType == 0) // detector ping only done if there are no pending data from slave
    {
      Wire.requestFrom(8,1); // request 1 byte package
      if (Wire.available()) // first byte as message type
      {
        /* transmission mode table
        mode 0 = idle, do nothing
        mode 1 = normal keystroke package
        mode 2 = print EEPROM
        */
        byte input = Wire.read();
        byte transmissionType = input >> 6; // this is 11000000
        byte transmissionSize = input & 0x3F; // this is 00111111
        byte transmissionIndex = 0;
      }
    }
  }
  
  if (transmissionIndex < transmissionSize)
  {
    Wire.requestFrom(8,1);
    slaveDataArray[transmissionIndex] = Wire.read();
    transmissionIndex++;
  }
  else // all data received
  {
    if (transmissionType == 1) // this is a keystroke package
    {
      ...//insert code here to process slaveDataArray for keystrokes
    }
    else if (transmissionType == 2) // this is an EEPROM dump
    {
      ...//etc
    }

    // resets variables
    transmissionType = 0;
    transmissionSize = 0;
    transmissionIndex = 0;
  }
}

If we are requesting 1 byte at a time then it should make a difference placing the loop outside of the 1ms loops right? I realise the above solution would send ACK, start, and stop bits every loop instead of your solution where it is only sent when the entire package is sent, but would the speed difference be that major?

@blahlicus
Copy link
Owner

Or really we should switch to a better I2C library. I'm really not certain which way is the best to deal with this.

@BlackCapCoder
Copy link
Contributor Author

I am currently experimenting with toggling who is the master so we don't have to pay for requestFrom. The idea is that the guest starts off as the master and sends the keystrokes once the buffer is full, then it becomes the slave and waits for the host to send either an update, or an empty package to let it know that it should become the master again.

@blahlicus
Copy link
Owner

blahlicus commented Feb 27, 2019

Take note the host has to receive PressCoord, then send the current layer to the guest, then ask the guest for the correct keyValue and keyType to deal with LayerState (See line 114 in original ModI2CHost.cpp)

This is especially annoying if the guest presses an FN key because the layer switching is done by the host, you might end up having to ask for the current layer between each key unless you let the I2C master handle layering regardless of who is the host.

The SAMD21 chip has large enough storage that we could store the entire layout in the host and have the slave only send X Y coordinates, but if we do that, then we will half the already meager 1024 byte EEPROM on the Atmega32u4 platforms for storing layouts.

Maybe I could work on platform specific PersMem modules and have a wrapper around that instead if that's really desirable.

@BlackCapCoder
Copy link
Contributor Author

Holy shit that was a headache! I'll send a PR once the old one has been processed

BlackCapCoder added a commit to BlackCapCoder/animus-family that referenced this issue Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants