You know what I hate? Paragraphs of capital (or all lowercase) letters.
The other day a coworker was looking to beautify a large quantity of data spread across some MySQL tables. I created this function to make his life easier:
DELIMITER $$ CREATE FUNCTION `f_sentence_case`(strInput TEXT, nMinimumLength INT) RETURNS TEXT DETERMINISTIC BEGIN DECLARE result TEXT; DECLARE LastSpace INT; DECLARE NextSpace INT; DECLARE NextSlash INT; DECLARE Word TEXT; DECLARE NewSentence INT; DECLARE PreviousCharacter CHAR(1); DECLARE TrimmedWord TEXT; DECLARE NumberOfSpaces INT; DECLARE Swap INT; SET strInput := CONCAT(strInput, ' '); SET result := ''; SET LastSpace := 1; SET NextSpace := LOCATE(' ', strInput, LastSpace + 1); SET NextSlash := LOCATE('/', strInput, LastSpace + 1); SET NextSpace := IF(LEAST(NextSlash, NextSpace) = 0, GREATEST(NextSlash, NextSpace), LEAST(NextSlash, NextSpace)); label1: WHILE NextSpace DO SET Word := SUBSTR(strInput, LastSpace, NextSpace - LastSpace); SET PreviousCharacter := SUBSTR(strInput, LastSpace - 1, 1); SET NewSentence := LastSpace = 1 OR (NewSentence AND PreviousCharacter = ' ') OR PreviousCharacter IN('.', '!', '?'); SET TrimmedWord := LTRIM(Word); SET NumberOfSpaces := LENGTH(Word) - LENGTH(TrimmedWord); SET Word := TrimmedWord; # Make it lowercase if it is all uppercase SET Word := IF(LENGTH(Word) >= nMinimumLength AND Word NOT REGEXP '[0-9]', IF(NewSentence, CONCAT(UCASE(SUBSTR(Word, 1, 1)), LCASE(SUBSTR(Word, 2, LENGTH(Word) - 1))), LCASE(Word) ), Word); SET result := CONCAT(result, REPEAT(' ', NumberOfSpaces), Word); SET Swap := LastSpace; SET LastSpace := NextSpace; SET NextSpace := LOCATE(' ', strInput, Swap + 1); SET NextSlash := LOCATE('/', strInput, Swap + 1); SET NextSpace := IF(LEAST(NextSlash, NextSpace) = 0, GREATEST(NextSlash, NextSpace), LEAST(NextSlash, NextSpace)); END WHILE label1; RETURN result; END$$ DELIMITER ;
What it does
It seems to perform generally as I hoped it would; which is to say that it formats text to be sentence case.
More specifically, it alters all the "words" (a set of non-numeric, non-whitespace characters) that are longer than the specified minimum length.
It changes the words to be all lowercase, unless they happen to be the first word after a punctuation mark (in which case the first character of the word is made uppercase).
Useage
To clean up a field so that it is formatted in sentence case (ignoring all words with less than 3 characters), simply run this query:
UPDATE `table` SET `field` = f_sentence_case(`field`, 3);
Other than that, my only specs for the query were for it to be functional and hopefully not break my brain when I went back to read it later. If anyone has any significant improvements to it, let me know!