RCTF2020
Switch PRO Controller
I bought a Switch PRO Controller!! It’s really cool!
Two files are provided:
- Packet capture of some form of USB HID device
- Screen recording of someone typing out a flag with an on-screen keyboard
Unfortunately, the screen recording has the flag visually obscured. The objective is thus to recover the button presses from the packet capture and identify the letters entered.
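This thread suggests that the Switch Pro Controller simply transmits its Bluetooth HID data over USB. With that in mind, the reverse-engineered documentation of the Bluetooth HID reports matches the data present in the packet capture: the standard input reports start with 0x30, and bytes 3 to 5 carry the button state as bitfields.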
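To make that layout concrete, here is a minimal sketch that decodes the buttons held in a single report. It assumes the report ID 0x30 sits at byte 0 and the three button bytes at offsets 3 to 5, with the bit order taken from the reverse-engineered notes. The helper name `held_buttons` and the sample bytes are made up for illustration and are not taken from the capture.

```python
# Button names per report byte, least significant bit (0x01) first,
# following the table in the reverse-engineered notes.
BUTTONS = {
    3: ["Y", "X", "B", "A", "SR", "SL", "R", "ZR"],
    4: ["Minus", "Plus", "R Stick", "L Stick", "Home", "Capture", "--", "Charging Grip"],
    5: ["Down", "Up", "Right", "Left", "SR", "SL", "L", "ZL"],
}


def held_buttons(report: bytes) -> list[str]:
    """Return the names of the buttons held in one 0x30 input report."""
    # Ignore anything that is not a standard input report.
    if len(report) < 6 or report[0] != 0x30:
        return []
    held = []
    for offset, names in BUTTONS.items():
        for bit, name in enumerate(names):
            if report[offset] & (1 << bit):
                held.append(name)
    return held


# Hypothetical report: ID 0x30, timer, status byte, then the button bytes.
sample = bytes.fromhex("30a19108008000")
print(held_buttons(sample))  # ['A', 'ZL']
```

A single report only says which buttons are down at that instant, so press and release events have to be derived by comparing consecutive reports.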
We can thus extract the button press and release events from the packet capture:
#!/usr/bin/python3
import csv

# https://github.com/dekuNukem/Nintendo_Switch_Reverse_Engineering/blob/master/bluetooth_hid_notes.md#standard-input-report---buttons
# One row per button byte; columns are tab-separated, bit 0x01 first.
btn_map_raw = """3 (Right)\tY\tX\tB\tA\tSR\tSL\tR\tZR
4 (Shared)\tMinus\tPlus\tR Stick\tL Stick\tHome\tCapture\t--\tCharging Grip
5 (Left)\tDown\tUp\tRight\tLeft\tSR\tSL\tL\tZL"""

btn_map = []
for line in btn_map_raw.splitlines():
    data = line.split("\t")
    label = data[0][3:-1]  # "3 (Right)" -> "Right"
    btn_map.append([f'{label}: {key}' for key in data[1:]])


def to_s(timestring):
    # Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)
    ms = round(float(timestring) * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f'{h:02}:{m:02}:{s:02},{ms:03}'


# tshark -r capture.pcapng -T fields -e frame.time_relative -e usb.capdata > time_capdata.txt
with open("time_capdata.txt", newline="") as f:
    r = csv.reader(f, delimiter='\t')
    with open("btn_press.srt", "w") as outfile:
        p_btn = bytearray(3)  # previous button state, initially nothing held
        press_time = 0
        caption_count = 0
        for row in r:
            time, capdata = row
            # only interested in packets with usb.capdata
            if len(capdata) == 0:
                continue
            raw_capdata = bytearray.fromhex(capdata)
            # only interested in packets starting with 0x30
            if raw_capdata[0] != 0x30:
                continue
            btn = raw_capdata[3:6]
            if btn != p_btn:
                # XOR against the previous state to find the bits that flipped
                change = bytes((now ^ prev) for now, prev in zip(btn, p_btn))
                for i, b in enumerate(change):
                    for u in range(8):
                        # true means bit has changed
                        if b & (1 << u):
                            key = btn_map[i][u]
                            if btn[i] & (1 << u):
                                action = "Pressed"
                                press_time = time
                            else:
                                action = "Release"
                                # a release closes the caption opened at press_time
                                caption_count += 1
                                outfile.write(f'{caption_count}\n{to_s(press_time)} --> {to_s(time)}\n{key}\n\n')
                            print(time, key, action)
                p_btn = btn
The key press/release events are written out as a set of subtitle information, which can then be combined with the screen recording to identify when keys are selected on the on-screen keyboard.
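Note that the script tracks a single `press_time`, so if two buttons overlap, both captions start at the most recent press. That was apparently good enough here. A hedged sketch of a per-key variant follows; the function name `pair_events` and the sample events are hypothetical and not part of the original solution.

```python
def pair_events(events):
    """Turn (time, key, pressed) events into (start, end, key) intervals."""
    pressed_at = {}  # key -> time of the press still waiting for its release
    intervals = []
    for time, key, pressed in events:
        if pressed:
            pressed_at[key] = time
        elif key in pressed_at:
            # A release closes the interval opened by the matching press.
            intervals.append((pressed_at.pop(key), time, key))
    return intervals


# Made-up events in which two buttons overlap.
events = [
    (1.00, "Right: A", True),
    (1.05, "Left: Right", True),
    (1.20, "Right: A", False),
    (1.30, "Left: Right", False),
]
for start, end, key in pair_events(events):
    print(f"{start:.2f} --> {end:.2f}  {key}")
```

Each interval maps directly onto one subtitle cue: start time, end time, and the key name as the caption text.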
The subtitle file has to be delayed by approximately 6 seconds because the packet capture starts about 6 seconds into the screen recording.
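Most video players can apply a subtitle delay directly, but the offset can also be baked into the file. The following is a minimal sketch of that, assuming the cues use `HH:MM:SS` followed by `,` or `.` and three millisecond digits. The function name `shift_srt` is made up for illustration, and the 6-second value is only the rough estimate mentioned above.

```python
import re

# Matches SRT-style timestamps with either ',' or '.' before the milliseconds.
STAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def shift_srt(text: str, offset_s: float) -> str:
    """Shift every timestamp in the given SRT text by offset_s seconds."""
    offset_ms = round(offset_s * 1000)

    def shift(match):
        h, m, s, ms = map(int, match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # clamp so a negative offset cannot go below zero
        h, total = divmod(total, 3_600_000)
        m, total = divmod(total, 60_000)
        s, ms = divmod(total, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    return STAMP.sub(shift, text)


# Made-up cue, delayed by 6 seconds.
cue = "1\n00:00:01.250 --> 00:00:01.400\nRight: A\n\n"
print(shift_srt(cue, 6.0))
```

Applied to the contents of btn_press.srt, this lines the captions up with the screen recording. The exact offset may need some trial and error.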
The flag RCTF{5witch_1s_4m4z1ng_m8dw65} is then recovered (after replacing - with _).