
Commit b73068c

committed Apr 3, 2019
Consistently handle inline elements with spaces
This resolves some odd situations that can occur when a sentence contains inline elements that hold only whitespace.

The first situation is an element that contains a space between words, for example 'Test<span> </span>content'. This previously produced a two-space result, 'Test  content', because the element matched both the leading and the trailing whitespace test.

The second situation is an element that contains a space character outside the set the tests checked for, such as a non-breaking space (Unicode U+00A0); in that case the space was removed. For example, 'Test<span>&nbsp;</span>content' resulted in 'Testcontent', because the element matched neither the leading nor the trailing whitespace test.

This commit fixes both problems by changing the whitespace tests to use \s rather than a subset of space characters (which is consistent with the blank test [1]) and by emitting only a leading space when a blank element passes both the leading and the trailing whitespace test.

[1]: https://github.com/domchristie/turndown/blob/80297cebeae4b35c8d299b1741b383c74eddc7c1/src/node.js#L14
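The difference between the two character classes can be checked directly. The following is a minimal sketch that applies the old and new leading-whitespace patterns from this commit to a plain space and to U+00A0; the variable names are illustrative only and do not appear in turndown.

```javascript
// Old and new leading-whitespace patterns, as they appear in the diff.
const oldLeading = /^[ \r\n\t]/
const newLeading = /^\s/

const plainSpace = ' '
const nbsp = '\u00A0' // the character that &nbsp; decodes to

// Illustrative result object (not part of turndown).
const results = {
  oldPlain: oldLeading.test(plainSpace),
  oldNbsp: oldLeading.test(nbsp), // the old class does not list U+00A0
  newPlain: newLeading.test(plainSpace),
  newNbsp: newLeading.test(nbsp) // \s includes U+00A0 in JavaScript
}

console.log(results)
// { oldPlain: true, oldNbsp: false, newPlain: true, newNbsp: true }
```

Because the old pattern never matched U+00A0, an element containing only a non-breaking space produced neither a leading nor a trailing space, which is the 'Testcontent' case described above.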
1 parent 80297ce commit b73068c

2 files changed, +19 -3 lines changed

‎src/node.js

+5 -3
@@ -22,13 +22,15 @@ function flankingWhitespace (node) {
   var trailing = ''
 
   if (!node.isBlock) {
-    var hasLeading = /^[ \r\n\t]/.test(node.textContent)
-    var hasTrailing = /[ \r\n\t]$/.test(node.textContent)
+    var hasLeading = /^\s/.test(node.textContent)
+    var hasTrailing = /\s$/.test(node.textContent)
+    var blankWithSpaces = node.isBlank && hasLeading && hasTrailing
 
     if (hasLeading && !isFlankedByWhitespace('left', node)) {
       leading = ' '
     }
-    if (hasTrailing && !isFlankedByWhitespace('right', node)) {
+
+    if (!blankWithSpaces && hasTrailing && !isFlankedByWhitespace('right', node)) {
       trailing = ' '
     }
   }
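To see the effect of the change without a DOM, the patched logic can be exercised on plain objects. This is only a sketch under stated assumptions: the `isFlankedByWhitespace` below is a hypothetical stand-in (the real helper inspects sibling nodes), the `prevText`/`nextText` fields and the `join` helper are invented for illustration, and the return shape of `flankingWhitespace` is assumed because the hunk does not show it.

```javascript
// Hypothetical stand-in: the real helper walks DOM siblings. Here the
// neighbouring text is supplied on the node as prevText / nextText.
function isFlankedByWhitespace (side, node) {
  const neighbour = side === 'left' ? node.prevText : node.nextText
  if (!neighbour) return false
  return side === 'left' ? / $/.test(neighbour) : /^ /.test(neighbour)
}

// Body mirrors the patched hunk; the return value is an assumption.
function flankingWhitespace (node) {
  let leading = ''
  let trailing = ''

  if (!node.isBlock) {
    const hasLeading = /^\s/.test(node.textContent)
    const hasTrailing = /\s$/.test(node.textContent)
    const blankWithSpaces = node.isBlank && hasLeading && hasTrailing

    if (hasLeading && !isFlankedByWhitespace('left', node)) {
      leading = ' '
    }

    if (!blankWithSpaces && hasTrailing && !isFlankedByWhitespace('right', node)) {
      trailing = ' '
    }
  }

  return { leading, trailing }
}

// Plain-object stand-ins for <span> </span> and <span>&nbsp;</span>
// sitting between the text "Foo" and "Bar".
const spaceSpan = { isBlock: false, isBlank: true, textContent: ' ', prevText: 'Foo', nextText: 'Bar' }
const nbspSpan = { ...spaceSpan, textContent: '\u00A0' }

// Simplified assembly: a blank element contributes only its flanking whitespace.
const join = (node) => {
  const ws = flankingWhitespace(node)
  return node.prevText + ws.leading + ws.trailing + node.nextText
}

console.log(join(spaceSpan)) // Foo Bar
console.log(join(nbspSpan)) // Foo Bar
```

With the `blankWithSpaces` guard removed, the first case would emit both a leading and a trailing space, reproducing the two-space result the commit message describes.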

‎test/index.html

+14
@@ -888,6 +888,20 @@ <h2>This is a header.</h2>
   <pre class="expected">![](http://example.com/logo.png)</pre>
 </div>
 
+<div class="case" data-name="text separated by a space in an element">
+  <div class="input">
+    <p>Foo<span> </span>Bar</p>
+  </div>
+  <pre class="expected">Foo Bar</pre>
+</div>
+
+<div class="case" data-name="text separated by a non-breaking space in an element">
+  <div class="input">
+    <p>Foo<span>&nbsp;</span>Bar</p>
+  </div>
+  <pre class="expected">Foo Bar</pre>
+</div>
+
 <!-- /TEST CASES -->
 
 <script src="turndown-test.browser.js"></script>
