With regard to the ESD chip:
The ESD chip tries to prevent transient overvoltages from persisting on the protected line (which will be a voltage-sensitive circuit). Since (if I understand the application and schematic correctly) this circuit has a microcontroller with a data pin directly exposed to an external connector, some form of ESD protection is highly recommended: modern microcontrollers can be damaged by surprisingly small overvoltages.
Such a transient could occur if, as you suggest, one end of the audio cable is unplugged and a voltage is applied to it (by touching the exposed connector end) while the other end is plugged into an instance of the board. This applies whether or not the affected board is energized (and it looks like this ESD chip will function passively, as it claims to be effectively a pile of diodes).
You do in fact need protection on both halves of the board, since they can exist as physically separate devices (e.g. when stowed for transport).
The particular ESD chip you’re using has VCC on pin 5, GND on pin 2, and four interchangeable-looking functional pins on 1 / 3 / 4 / 6. As long as you orient the VCC and GND pins correctly, you should end up with a valid circuit.
With regard to decoupling capacitors:
The main purpose of these is to filter out high-frequency noise on the voltage rails. They do play a role in ESD protection (which involves huge instantaneous voltages but not all that much charge), but they also guard against other transient voltage variations that might affect operation or reliability of the circuit.
Because the decoupling capacitors you’re looking at are part of the ESD solution (specifically for the voltage rails), I wouldn’t suggest omitting them.
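To put rough numbers on the "huge voltage, not much charge" point, here is a back-of-the-envelope sketch of ideal charge sharing between an ESD source and a rail capacitor. All values are assumptions for illustration and are not taken from your schematic: a 150 pF source charged to 8 kV (in the style of the IEC 61000-4-2 test network), a 3.3 V rail, and a few typical decoupling values. It ignores the clamping diodes, series resistance, and trace inductance, so treat it as an intuition aid rather than a design calculation.

```python
def shared_voltage(c_source_f, v_source_v, c_rail_f, v_rail_v=0.0):
    """Final voltage after two capacitors share their charge (ideal, lossless)."""
    total_charge = c_source_f * v_source_v + c_rail_f * v_rail_v
    return total_charge / (c_source_f + c_rail_f)


# Assumed ESD source: 150 pF charged to 8 kV (illustrative values only).
C_SOURCE_F = 150e-12
V_SOURCE_V = 8000.0

# Total charge delivered by the strike: Q = C * V.
charge_c = C_SOURCE_F * V_SOURCE_V

print(f"charge in the strike: {charge_c * 1e6:.2f} uC")

# Assumed rail voltage and candidate decoupling capacitor values.
V_RAIL_V = 3.3
for c_rail_f in (100e-9, 1e-6, 10e-6):
    v_final = shared_voltage(C_SOURCE_F, V_SOURCE_V, c_rail_f, V_RAIL_V)
    print(f"{c_rail_f * 1e9:8.0f} nF rail capacitance -> {v_final:6.2f} V after the strike")
```

With these assumed numbers the strike carries only about 1.2 microcoulombs, and the more capacitance there is on the rail, the smaller the voltage step that charge can cause. That is why the capacitors help even though they are not ESD parts as such.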
Yes, it looks like that should be OK.