Bug 195647 - HTML Validation fails when PHP appears in URL's
Summary: HTML Validation fails when PHP appears in URL's
Status: REOPENED
Alias: None
Product: php
Classification: Unclassified
Component: Editor
Version: 7.0
Hardware: All All
Importance: P3 normal with 3 votes
Assignee: Ondrej Brejla
URL:
Keywords:
Duplicates: 195712
Depends on:
Blocks:
 
Reported: 2011-02-18 09:57 UTC by ignitedfirestarter
Modified: 2014-03-21 15:21 UTC
CC List: 7 users

See Also:
Issue Type: ENHANCEMENT
Exception Reporter:


Attachments
Screen shot of error (18.21 KB, image/png)
2011-02-18 09:57 UTC, ignitedfirestarter
Testcase producing the bug (313 bytes, application/octet-stream)
2012-10-25 11:24 UTC, Anael

Description ignitedfirestarter 2011-02-18 09:57:10 UTC
Created attachment 106143
Screen shot of error

Take, for example, the following piece of HTML:

<form id="form" action="<?= $this->submissionUrl ?>" method="post">


In a .phtml file this should not show HTML validation errors; however, the following error is shown:


Error: Bad value "   " for attribute "action" on element "form": DOUBLE_WHITESPACE in PATH.
Syntax of IRI reference:
Any URL. For example: '/hello', '#canvas', or 'http://example.org/'. Characters should be represented in NFC and spaces should be escaped as '%20'.

From line 2, column 2; to line 2, column 44


The validator should be more lenient when dealing with in-line PHP.

This happens with both HTML 4.01 and HTML5.
Comment 1 Marek Fukala 2011-02-24 15:00:24 UTC
The problem here is that you are validating the source of the page, not the result of the php execution. As such, the validation can never be perfect. Currently the validation is disabled for all languages that embed html code, so no such problems appear by default. You can enable it, but it is at your own risk and you should be aware of possible bottlenecks.

I'm CCing the PHP editor guys so that they may express their opinions. It is definitely possible to improve the behavior by doing some html-php heuristics, but such stuff should reside in php.
Comment 2 Marek Fukala 2011-02-24 15:15:38 UTC
*** Bug 195712 has been marked as a duplicate of this bug. ***
Comment 3 Martin Fousek 2011-02-24 17:22:46 UTC
You more likely CCed a PHP editor guy and me, a PHP user. :) Anyway, my opinion is that there can be any PHP expression, so creating a fully general heuristic for that seems impossible to me (you can chain anything; only the PHP interpreter itself would be strong enough).

Another thing is that the PHP delims create spaces, which cause the error hints, so removing these spaces would get rid of more of these hints. But I really don't know whether it would cause more trouble in other places. Do you know the reason for these spaces?
Comment 4 ignitedfirestarter 2011-02-24 19:18:13 UTC
>> Do you know reason of these spaces?

The spaces that we are talking about are just to make code easier to read (see screen shot).



I get that HTML validation can't work on PHP. My recommendation is that if I embed some PHP, we could suppress errors for that particular block or tag.

So, if you have an HTML attribute with PHP inside it, then that attribute should pass HTML validation.

I think the <a> tag with an href attribute is a pretty simple case: the spaces are inside the PHP, so realistically the HTML validator should see the embedded code and conclude that it doesn't have enough information to validate the attribute. It must therefore trust the developer and assume it to be correct (otherwise we get code littered with warnings).

There are of course other situations where this conclusion is not so easy to draw, for example:

<a <?= "href='somepage.php'" ?>>Click Here</a>

One would expect the validator to report that the <a> tag is missing its href (actually it wouldn't, because a tags don't need an href), but we know that the href is present. The validator cannot be sure of this because it cannot (and should not) be expected to deduce what a piece of PHP code will output. Should it say 'There is embedded code inside that tag, it might produce a valid href' or should it say 'I can't see an href, therefore there is no href'?

Sorry if I'm not really making myself clear, or if I'm rambling.
Comment 5 Martin Fousek 2011-02-25 07:30:15 UTC
By the spaces I meant the spaces that are generated by the <? ?> delims. They have nothing to do with your spaces inside the PHP code. It's internal behavior of NetBeans that these delims are replaced by spaces.

But of course this wouldn't work for the case of a missing href on an "a" tag. So your suggestion of disabling validation where code is embedded would be one way. I don't know what ppisl would think about that, but at least I think that we could track it as an enhancement.
Comment 6 Marek Fukala 2011-02-25 09:48:37 UTC
Nice discussion about whitespaces guys, but a bit pointless I'm afraid ;-) The primary problem here is that, as I already mentioned, you cannot validate the php source for html validity until it actually runs. Even then, the validity of the result depends on the state of the system and may vary significantly. There can be simple PHP files with almost only html content and only a few php code blocks which don't influence the structure of the html code much. In such cases one could possibly want to validate it.

Currently the mechanism for enabling/disabling the html validation is far from ideal: you can only enable or disable it for the mime type or, if the validation for the mime type is enabled, you can disable it for particular files. As a minimal solution, for the sake of the case mentioned above, it would be nice if one could enable the validation per file when it is globally disabled for the mime type.

As for the whitespaces, the problem is a bit more complicated. There is something like a provider of html code for php files, which extracts the html code and replaces the php parts with some special marks. The validator then uses this artificial html source for validation, and the errors are remapped back to the php code. Since the validator does not like the special marks in the code, they are masked by whitespaces before the validation. And that is the reason why it complains about whitespaces in the place where there actually is a piece of php. It has absolutely nothing to do with either the whitespaces in the php code or whitespaces generated during the code execution.
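
To illustrate the masking step described above, here is a minimal Python sketch. It is not the NetBeans implementation (NetBeans is written in Java and goes through intermediate special marks); `PHP_BLOCK` and `mask_php` are hypothetical names, and the regular expression is a simplification that assumes no `?>` occurs inside a PHP string literal.

```python
import re

# Matches an inline PHP block such as <?= ... ?> or <?php ... ?> (non-greedy).
# Simplification: a "?>" inside a PHP string literal would end the match early.
PHP_BLOCK = re.compile(r"<\?.*?\?>", re.DOTALL)


def mask_php(source: str) -> str:
    """Replace each PHP block with the same number of spaces.

    Keeping the length unchanged means every offset in the masked text
    is also a valid offset in the original source, so errors can be
    mapped back without any bookkeeping.
    """
    return PHP_BLOCK.sub(lambda m: " " * len(m.group(0)), source)


source = '<form id="form" action="<?= $this->submissionUrl ?>" method="post">'
masked = mask_php(source)

# The attribute value handed to the validator consists only of spaces,
# which is what would trigger a whitespace complaint on "action".
value = re.search(r'action="([^"]*)"', masked).group(1)
print(repr(value))
```

The trade-off in this approach is that offsets stay aligned for free, at the cost of handing the validator an attribute value made of spaces.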

There's the same problem with many other templating languages like JSP, RHTML, GSP, JSF-facelets etc.

I believe the solution is either some heavy heuristic with very inaccurate results or just disabling the validation for non-pure-html files. If anyone wants to try the former, they are welcome.
Comment 7 dharkness 2011-04-06 18:43:17 UTC
Having the validator ignore all embedded PHP tags would be far better than the current situation. Sure, many tags would continue to fail validation (e.g. empty HREF attribute), but I'd prefer that to these bogus "double whitespace" failures.
Comment 8 Marek Fukala 2011-04-07 08:12:49 UTC
The html validator is disabled by default for PHP files. I believe it's a correct assumption that most of the PHP files simply cannot be properly validated for the html content. For the rest of the cases one may enable it by the hint at the very first line.
Comment 9 tormit 2011-04-09 16:34:06 UTC
The validator should behave this way: when it finds a PHP tag in a src attribute, it should assume the attribute is valid.
Comment 10 Marek Fukala 2011-04-11 15:02:58 UTC
Please read my comment #8
Comment 11 rgoldberg 2011-08-01 20:29:48 UTC
Is there any way that the PHP editor guys can add an option to the PHP options tab to consider inline attribute values as errors (current behavior), warnings, or always valid?

It would remove many spurious errors.
Comment 12 iainhallam 2012-03-16 07:41:32 UTC
(In reply to comment #6)
> As for the whitespaces, the problem is bit more complicated. There is something
> like provider of html code for php files, which extracts the html code and
> replaces the php places by some special marks. The validator then uses this
> artificial html source to validate and the errors are remapped back to the php
> code. Since the validator do not like the special marks in the code they are
> masked by whitespaces before the validation. And here is the reason why it
> complains about whitespaces in the place where actually is a piece of php. It
> has absolutely nothing to do with neither the whitespaces in the php code nor
> with whitespaces generated during the code execution.

Then that's the real problem. The HTML provider should not introduce white space to mask its marks; it should delete the mark characters before presenting the HTML for validation.

I don't know how the result passes back to the editor, but if NetBeans can put PHP back in place when the marks have been obliterated by white space, then surely it can work when the marks have been obliterated by no space?

Certainly don't try to validate the PHP results, but when removing the PHP, don't introduce other HTML errors that aren't anything to do with the entered PHP...
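
A rough Python sketch of what "obliterating by no space" could involve is shown here. It is only an illustration under assumptions: `strip_php`, `to_original` and the checkpoint list are hypothetical names, not anything from the NetBeans code base. The point is that once the PHP is deleted, offsets no longer line up, so some mapping back to the original source has to be kept.

```python
import re
from bisect import bisect_right

# Simplified pattern for inline PHP; assumes no "?>" inside PHP string literals.
PHP_BLOCK = re.compile(r"<\?.*?\?>", re.DOTALL)


def strip_php(source):
    """Delete PHP blocks and record how far later text has shifted.

    Returns the stripped text and a list of checkpoints
    (offset in stripped text, characters removed before that offset).
    """
    pieces = []
    checkpoints = [(0, 0)]
    last = 0
    removed = 0
    for m in PHP_BLOCK.finditer(source):
        pieces.append(source[last:m.start()])
        removed += m.end() - m.start()
        # Text following this block starts here in the stripped output.
        checkpoints.append((m.end() - removed, removed))
        last = m.end()
    pieces.append(source[last:])
    return "".join(pieces), checkpoints


def to_original(offset, checkpoints):
    """Translate an offset in the stripped text back to the original source."""
    starts = [start for start, _ in checkpoints]
    i = bisect_right(starts, offset) - 1
    return offset + checkpoints[i][1]


source = '<tag attribute="<?=$myAttr?>" />'
stripped, checkpoints = strip_php(source)
print(stripped)
```

Note that the stripped text contains an empty attribute value, so attributes that must not be empty may still be flagged, as comment #7 anticipates.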
Comment 13 Ondrej Brejla 2012-04-20 10:34:07 UTC
I think Marek said everything very clearly in comment #8. IMHO the current behavior is as correct as it can be for now. And as Martin said, it's definitely an enhancement.
Comment 14 iainhallam 2012-04-20 11:38:50 UTC
OK; I personally don't mind that this is an enhancement, as long as it's on the radar somewhere. To reflect the fact that we seem to have a consensus that trying to parse anything inside the PHP is just going to lead to massive problems, should the name be changed to something like:

HTML validation should remove PHP entirely, not introduce white space
Comment 15 Anael 2012-10-25 11:24:47 UTC
Created attachment 126538
Testcase producing the bug

A testcase showing that the bug occurs whether or not there is whitespace inside the PHP tags.

In the 7.3 release, the HTML validator is enabled inside .php files.
It would be great to have this solved :-)
Comment 16 nbsocko 2013-12-05 17:24:15 UTC
To say this behavior isn't a defect is problematic.  The following code should not throw an error, period:

<tag attribute="<?=$myAttr?>" />

NB 7.4 does throw a whitespace error on this code, and it should absolutely not do so.  Suggesting that programmers turn off HTML checking in PHP files is not a reasonable solution for this defect.  The proper fix is to look inside the attribute's quotation marks, and if a replacement tag is present, assume it is valid.  No programmer is expecting every possible database value, or even any database value to be evaluated; nor is that a reasonable argument for leaving this faulty behavior unfixed, or calling a fix "enhancement".
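
As a sketch of the fix proposed above (assume an attribute is valid when a replacement tag sits inside its quotation marks), something along these lines could work. This is Python for illustration only: `Diagnostic`, `php_attribute_spans` and `filter_diagnostics` are hypothetical names, and it assumes each diagnostic carries the offsets of the attribute it refers to, whereas the error quoted in the description points at the whole tag.

```python
import re
from typing import NamedTuple

# Simplified pattern for inline PHP; assumes no "?>" inside PHP string literals.
PHP_BLOCK = re.compile(r"<\?.*?\?>", re.DOTALL)
# Double-quoted attributes only; enough for a sketch.
ATTRIBUTE = re.compile(r'[\w-]+\s*=\s*"[^"]*"')
PLACEHOLDER = "\0"  # hypothetical marker that should not occur in real markup


class Diagnostic(NamedTuple):
    start: int    # offset where the offending attribute begins (assumed)
    end: int      # offset just past the offending attribute (assumed)
    message: str


def php_attribute_spans(source):
    """Return (start, end) spans of quoted attributes whose value contains PHP."""
    # Mask PHP first so quotes inside the PHP code cannot end the attribute early.
    masked = PHP_BLOCK.sub(lambda m: PLACEHOLDER * len(m.group(0)), source)
    return [m.span() for m in ATTRIBUTE.finditer(masked) if PLACEHOLDER in m.group(0)]


def filter_diagnostics(source, diagnostics):
    """Drop diagnostics that lie entirely inside a PHP-driven attribute."""
    spans = php_attribute_spans(source)
    return [d for d in diagnostics
            if not any(s <= d.start and d.end <= e for s, e in spans)]


source = '<form action="<?php echo $action; ?>" method="psot">'
on_action = Diagnostic(source.index("action"), source.index('" method') + 1,
                       'Bad value for attribute "action"')
on_method = Diagnostic(source.index("method"), source.rindex('"') + 1,
                       'Bad value "psot" for attribute "method"')

# Only the diagnostic on the plain HTML attribute survives.
kept = filter_diagnostics(source, [on_action, on_method])
print([d.message for d in kept])
```

Errors on ordinary attributes are still reported; only attributes whose value depends on PHP output are given the benefit of the doubt.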

I use replacement tags within attributes all the time, and as of 7.4 there are false error reports all over my right sidebar.  This impedes the productivity of NB users.  Please fix it.
Comment 17 JustMeAgain 2014-03-21 15:17:49 UTC
I fully agree that this is an issue that needs a viable solution.

I work constantly with "off-the-shelf" shopping cart systems. Code validation seems to work great, with one fatal flaw: these "bad value" warnings.

Take this section of code for example:

<form action="<?php echo $action; ?>">

NetBeans flags this as an error, when it absolutely should not. And because of this my left pane is littered with dozens of files marked with little red exclamation marks. 

What winds up happening is each and every one of these files gets checked over and over and over due to the error flag.

It makes it impossible to tell when an actual real error exists.

------------------------

Running NetBeans 7.3.1; the most current version available according to the updater.