When you click the caption placeholder text "Enter a caption for this image (optional)", the cursor lands in the middle of the text, which makes it hard to tell that the click actually registered on the caption. The placeholder is not removed until you start typing. It would be better to remove the placeholder as soon as the caption is clicked, which makes it obvious that you can start typing.
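
To make the requested behavior concrete, here is a minimal sketch of the caption state, written as plain TypeScript with no DOM code. All names (`CaptionState`, `initialCaption`, `onCaptionClick`, `onCaptionBlur`) are hypothetical and not taken from the editor's source. Restoring the prompt when an empty caption loses focus is an extra assumption, not something this report asks for.

```typescript
// Hypothetical model of the caption field; none of these names come from the editor's code.
const PLACEHOLDER = "Enter a caption for this image (optional)";

interface CaptionState {
  text: string;           // what the caption field currently displays
  isPlaceholder: boolean; // true while the prompt text is still showing
}

function initialCaption(): CaptionState {
  return { text: PLACEHOLDER, isPlaceholder: true };
}

// Proposed behavior: a click clears the prompt right away,
// instead of waiting for the first keystroke.
function onCaptionClick(state: CaptionState): CaptionState {
  if (state.isPlaceholder) {
    return { text: "", isPlaceholder: false };
  }
  return state; // a real caption is left untouched
}

// Assumption (not part of this report): if the user leaves the field
// without typing anything, the prompt comes back.
function onCaptionBlur(state: CaptionState): CaptionState {
  return state.text === "" ? initialCaption() : state;
}
```

Tracking `isPlaceholder` separately from `text` means a caption that happens to match the prompt wording would not be cleared on click.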